CHAPTER I


INTRODUCTION AND PRELIMINARIES 

In most situations the parameters of a statistical 
distribution are not known with certainty and must be 
estimated. In this dissertation these parameters are 
treated as random variables and empirical Bayes point 
estimates are given. In particular, the two-parameter 
Weibull distribution is considered. 

1.1 Historical Background of the Weibull Distribution

In 1939 a Swedish scientist, Waloddi Weibull, derived 
a statistical distribution with which his name has been 
associated in recent years. This derivation came about as 
the result of an analysis of breaking-strength data and can 
be found in [34]. Weibull also published related papers
[35], [37], and in [36] he illustrated the distribution's
practical value in analyzing various types of data.

The wide audience these papers found among reliability 
engineers after World War II firmly attached the Weibull 
name to this statistical distribution. The distribution, 
however, was originally derived in 1928 by R. A. Fisher and 
L. H. C. Tippett [8]. Their derivation became known to 
researchers who were familiar with extreme-value theory as
the Fisher-Tippett Type III distribution of extreme values 
and as the third asymptotic distribution of extreme values. 

1.2 Brief Survey of Previous Research

The problem of estimating the parameters in the 
Weibull distribution has received considerable attention 
in recent literature from several authors. The techniques 
proposed by these authors encompass a wide spectrum of 
statistical methods. These methods will be referred to 
as "classical" methods of estimation. 

Graphical techniques for grouped and ungrouped data
have been proposed by Kao [15]. Best linear unbiased
estimators (BLUE) were computed by Govindarajulu and
Joshi [9] using ordered observations for small sample
sizes. White [33] obtained linear, unbiased, least-squares
estimators for the censored log-Weibull
distribution. Gumbel [10], Menon [21], Miller and
Freund [22], and Bain and Antle [1] all give simple
estimators, that is, estimators which do not require
tedious computations. Maximum-likelihood estimators,
which generally provide useful estimates, have been
considered by several authors; among them are Cohen [4],
Dubey [7], Harter and Moore [12], and Thoman et al. [31].
Although this by no means represents an
exhaustive list of authors concerned with parameter
estimation in the Weibull distribution, it does point out
the considerable attention that the distribution has
received.

1.3 Purpose

The type of decision problem to be considered in this 
dissertation can best be illustrated by an example. Consider
the development program for a particular solid-propellant-rocket
engine which must "burn" for a specified
time. In this program, certain points exist at which
progress is monitored. For instance, the Pre-Flight
Rating Test program would be one such point at the culmination
of the initial R&D program, demonstrating the
ability of a sample of engines to perform for a specified 
length of time. After this phase a new phase is entered in 
which flight and static tests are performed, and if needed, 
a more refined system configuration is developed. Finally, 
design is frozen, and a Qualification Test program is 
undertaken to demonstrate the suitability of the engine 
system. During this period in the program, several groups 
of engines are test-fired, and due to stringent reliability 
requirements, a large sample of engines is required. 

Throughout these development phases, it is quite
possible that the form of the time-to-failure distribution
remains unchanged, that is, no significant design changes
were incorporated which would greatly affect the overall
characteristic performance of the engines. In each of the
experiments conducted throughout the total program, however,
the classical estimates obtained for the unknown parameters
varied unpredictably from experiment to experiment. Since
no specific cause for this variation could be found, the
variation was considered to be random. This random variation
may have been caused by the interaction of the
components comprising the engine system, by variation in
the solid propellant mixing process, or by numerous other
uncontrollable factors.

If the researcher is using a classical estimation 
procedure during the Qualification Test program, he must 
choose a large sample size in order to meet the stringent 
reliability requirement. He would, of course, like to use 
the data obtained from the previous experiments. His pro- 
cedure, however, restricts him to the use of only the data 
in the present experiment. He could consider "pooling" 
all previous data to obtain point estimates for the 
parameters; however, he would be violating the basic 
principles on which classical methods are established and 
might obtain inaccurate results. 

To use these methods, it is necessary to assume that 
the data are obtained from the same specified distribution. 
Therefore, in order to pool data from previous experiments, 
the unknown parameters must remain constant throughout 
each experiment. This is not usually the case in a development
program; therefore previous data must be ignored
when using classical techniques. 

Since the fluctuation in the parameters for the time- 
to-failure distribution can be attributed to random 
variation, data from previous experiments can and should 
be used to obtain point estimates in the present experi- 
ment. When the researcher knows the prior distribution 
describing this variation, the Bayes principle can be used 
to provide estimates for the parameters. In terms of the 
minimization of the overall expectation of some appro- 
priate loss structure, this principle provides "best" 
estimates for these parameters. Two attempts to apply 
Bayes analysis when the time-to-failure distribution is
Weibull are given by Soland [30] and Harris and
Singpurwalla [11]. In each of these papers various
forms for the prior distribution are considered. 

The Bayes method for parameter estimation is 
difficult to apply in many situations. For "best" results 
it requires a completely known and specified prior distri- 
bution, which is seldom available. In such situations an 
empirical Bayes approach can be utilized. This approach 
acknowledges the existence of a prior distribution; 
however, this distribution need never be explicitly known 
to the researcher. Estimates of the parameters obtained 
in previous experiments can be used to improve the estimates
in the present experiment. This approach will be
discussed in detail in the next section. 

The main purpose of this dissertation is to develop 
an empirical Bayes estimator that is capable of providing 
significant improvement over both the classical and 
presently known empirical Bayes estimators. The estimator 
will then be used to obtain point estimates for the unknown 
parameters in the two-parameter Weibull distribution. The 
particular form of the distribution considered has the 
following probability density function: 

\[ f(x) = \alpha\beta x^{\beta-1} e^{-\alpha x^{\beta}} \qquad (x > 0;\ \alpha, \beta > 0) \tag{1.1} \]

where $\alpha$ denotes the scale parameter and $\beta$ the shape
parameter. Figures 1 and 2 illustrate the influence these
parameters have on the Weibull density function (1.1).

In Figure 1, $\alpha = 1$ and plots of $f(x)$ in (1.1) are given
for various values of $\beta$. We remark that for $\beta = 1$,
$f(x)$ plots as a one-parameter exponential density function
with the parameter equal to one. More generally, when $\beta = 1$,
(1.1) reduces to the well-known exponential density
function with parameter $\alpha$. In Figure 2, $\beta = 3$ and
plots of $f(x)$ from (1.1) are given for several values
of $\alpha$.
Figure 1. Weibull density function ($\alpha = 1$); horizontal axis: x values.

Figure 2. Weibull density function ($\beta = 3$); horizontal axis: x values.
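The density (1.1) and its reduction to the exponential density at shape parameter one can be checked numerically. The following is a minimal sketch, not part of the original development; the function names `weibull_pdf`, `exponential_pdf`, and `integrate` are illustrative assumptions.

```python
import math


def weibull_pdf(x, alpha, beta):
    """Density (1.1): alpha * beta * x**(beta - 1) * exp(-alpha * x**beta) for x > 0."""
    if x <= 0:
        return 0.0
    return alpha * beta * x ** (beta - 1) * math.exp(-alpha * x ** beta)


def exponential_pdf(x, alpha):
    """One-parameter exponential density with parameter alpha."""
    return alpha * math.exp(-alpha * x) if x > 0 else 0.0


def integrate(f, lo, hi, steps=20000):
    """Composite midpoint rule; adequate for the smooth densities used here."""
    width = (hi - lo) / steps
    return width * sum(f(lo + (i + 0.5) * width) for i in range(steps))


# Setting of Figure 2 with scale 1: shape parameter 3; the density should integrate to one.
total_mass = integrate(lambda x: weibull_pdf(x, 1.0, 3.0), 0.0, 10.0)
```

With the shape parameter set to one, `weibull_pdf(x, alpha, 1.0)` agrees with `exponential_pdf(x, alpha)`, which is the reduction noted in the text.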
1.4 Bayes and Empirical Bayes Estimation

Throughout this dissertation, upper-case Greek or
Roman letters will identify random variables. Lower-case
letters will be reserved for realizations of the same.
In some cases ease of notation may cause violation of this
convention; however, in such instances the proper meaning
will be clear.

To begin, let us define the basic decision-theoretic
elements on which both the empirical Bayes and the Bayes
approaches are based. They are as follows:

(i) There is a parameter space $\Theta$ with generic
s-vector $\theta = (\theta_1, \theta_2, \ldots, \theta_s)$ on which is
defined a probability distribution $G$ referred
to as a prior distribution.

(ii) There is a decision space $D$, which coincides
with $\Theta$ for estimation, with generic
element $\delta$.

(iii) There is a loss function $\ell(\delta, \theta) \ge 0$ representing
the loss incurred when $\delta$ is taken as
an estimate for $\theta$.

(iv) There is an observable random k-vector
$X = (X_1, X_2, \ldots, X_k)$, distributed on a space
$\mathcal{X}$ on which is defined a $\sigma$-finite measure $\mu$.
When the parameter is $\theta$, $X$ has a density
$f(x \mid \theta)$ with respect to $\mu$.

If the decision function $\delta$ is chosen and the sample
vector $x$ is observed, then $\delta(x)$ is taken as an estimate
of $\theta$ and the loss $\ell(\delta(x), \theta)$ is incurred. For any such
decision $\delta$ the expected loss when $\theta$ is the true
parameter is given by

\[ R(\delta, \theta) = \int_{\mathcal{X}} \ell(\delta(x), \theta)\, f(x \mid \theta)\, d\mu(x) . \tag{1.2} \]

Hence, the global or overall risk can be represented
as

\[ R(\delta, G) = \int_{\Theta} R(\delta, \theta)\, dG(\theta) \tag{1.3} \]

where $G$ is the prior distribution of $\theta$. If the prior
density corresponding to $G(\theta)$ is denoted by $g(\theta)$, this risk
can be expressed as

\[ R(\delta, G) = \int_{\mathcal{X}} f(x) \int_{\Theta} \ell(\delta(x), \theta)\, h(\theta \mid x)\, d\theta\, dx \]

where

\[ h(\theta \mid x) = \frac{f(x \mid \theta)\, g(\theta)}{f(x)} = \frac{f(x, \theta)}{f(x)} \tag{1.4} \]

and $f(x)$ is the marginal density of $X$. Therefore, to
minimize the risk $R(\delta, G)$, a decision function $\delta(x)$ should
be chosen such that

\[ \phi(\delta, x) = \int_{\Theta} \ell(\delta(x), \theta)\, h(\theta \mid x)\, d\theta \tag{1.5} \]

is a minimum. If such a decision function, say $\delta_G(x)$, exists,
it is known as the Bayes decision function and

\[ R(G) = R(\delta_G, G) = \min_{\delta} R(\delta, G) \tag{1.6} \]

is known as the Bayes risk.

When $G$ is known, the optimal decision $\delta_G$ can be
determined. If $G$ is unknown, however, this minimum risk
decision cannot be obtained. In the empirical Bayes
approach complete determination and specification of the
prior distribution is unnecessary. Instead, it is assumed
that the decision problem given above has occurred repeatedly
and independently with the same unknown prior
distribution throughout. Thus, there exists a sequence

\[ (X_1, \theta_1), (X_2, \theta_2), \ldots, (X_n, \theta_n) \tag{1.7} \]

of independent pairs of random vectors $(X, \theta)$, where
$X_i$ $(i = 1, 2, \ldots, n)$ has dimension $k$ and $\theta_i$ $(i = 1, 2, \ldots, n)$
has dimension $s$. At the time an estimate of $\theta_n$ for
the nth realization or present experiment is to be determined,
the researcher has at his disposal the vector-valued
sequence $x_1, x_2, \ldots, x_n$. The s-vector $\theta_n$
remains, of course, unknown. This information can be
used to provide a decision function of $x_n$ based upon
$x_1, x_2, \ldots, x_{n-1}$, like

\[ \delta_n(x_n) = \delta_n(x_1, x_2, \ldots, x_{n-1};\ x_n) \tag{1.8} \]

such that when $x_n$ is observed, $\delta_n \in D$ is taken as an
estimate of $\theta_n$ and the loss $\ell(\delta_n(x_n), \theta_n)$ is incurred.

Such a decision function will be called an empirical
Bayes decision function. Hence, when $G$ is unknown to
the researcher, he is able to extract some information
about the prior distribution through the sequence
$x_1, x_2, \ldots, x_n$ and obtain a decision function
$\delta_n(x_n)$ that approximates the Bayes decision function $\delta_G(x)$.

1.5 Squared Error Loss

Consider the squared error loss function

\[ \ell(\delta_i(x), \theta_i) = (\delta_i(x) - \theta_i)^2 \tag{1.9} \]

for the $\theta_i$ component of $\theta$. If we represent $\phi(\delta_i, x)$
defined by (1.5) as $E[\ell(\delta_i(x), \theta_i) \mid x]$ and replace
$\ell(\delta_i(x), \theta_i)$ by (1.9), then (1.5) can be written as

\[ E\left[ (\delta_i(x) - \theta_i)^2 \mid x \right] . \tag{1.10} \]

After adding and subtracting appropriate quantities and
simplifying, we have

\[ E\left[ (\delta_i(x) - \theta_i)^2 \mid x \right] = \left( \delta_i(x) - E(\theta_i \mid x) \right)^2 + \mathrm{Var}(\theta_i \mid x) \tag{1.11} \]

which will be a minimum when

\[ \delta_i(x) = E(\theta_i \mid x) . \tag{1.12} \]

The Bayes estimator for each $\theta_i$ $(i = 1, 2, \ldots, s)$ is
therefore given by (1.12), and from (1.6) the Bayes risk
becomes $E[\mathrm{Var}(\theta_i \mid x)]$. In general, the Bayes estimator for
$\theta$ is given by

\[ \delta(x) = E(\theta \mid x) . \tag{1.13} \]
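The decomposition (1.11) can be verified on a small discrete posterior. The sketch below is illustrative only: the support points and weights are made-up numbers, and the helper names are assumptions rather than anything defined in the text.

```python
def posterior_moments(thetas, weights):
    """Normalize the weights and return (probabilities, posterior mean, posterior variance)."""
    total = sum(weights)
    probs = [w / total for w in weights]
    mean = sum(p * t for p, t in zip(probs, thetas))
    var = sum(p * (t - mean) ** 2 for p, t in zip(probs, thetas))
    return probs, mean, var


def expected_sq_loss(decision, thetas, probs):
    """Posterior expected squared error loss, the quantity in (1.10)."""
    return sum(p * (decision - t) ** 2 for p, t in zip(probs, thetas))


# Hypothetical posterior on four support points (values chosen only for illustration).
thetas = [0.5, 1.0, 1.5, 2.0]
weights = [1.0, 3.0, 4.0, 2.0]
probs, post_mean, post_var = posterior_moments(thetas, weights)

# Right-hand side of (1.11) for an arbitrary decision d.
d = 1.8
rhs = (d - post_mean) ** 2 + post_var
lhs = expected_sq_loss(d, thetas, probs)
```

Here `lhs` and `rhs` agree up to rounding, and the loss is smallest at `post_mean`, as (1.12) states.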

1.6 Bayes Estimate for a Sufficient Statistic

Consider an arbitrary random sample of size $k$ denoted
by $x = (x_1, x_2, \ldots, x_k)$ from a univariate distribution
with density function $f(x \mid \theta)$. If there exists a set of
sufficient statistics $t = (t_1, t_2, \ldots, t_s)$ for
$\theta = (\theta_1, \theta_2, \ldots, \theta_s)$, then by the Neyman factorization
criterion [23] the likelihood function

\[ L(x \mid \theta) = \prod_{i=1}^{k} f(x_i \mid \theta) \tag{1.14} \]

factors into

\[ L(x \mid \theta) = f(t \mid \theta)\, c(x) \tag{1.15} \]

where $f(t \mid \theta)$ represents the conditional density function
of $t$ and $c(x)$ is some function of $x$ not involving $\theta$.
If the prior density function is denoted by $g(\theta)$, the
joint density function of $x$ and $\theta$ can be written

\[ f(x, \theta) = L(x \mid \theta)\, g(\theta) , \]

and by (1.15) this becomes

\[ f(x, \theta) = f(t \mid \theta)\, g(\theta)\, c(x) = f(t, \theta)\, c(x) . \tag{1.16} \]

If (1.16) is integrated over the region $\Theta$, the marginal
density of $X$ becomes

\[ f(x) = p(t)\, c(x) \tag{1.17} \]

and by dividing (1.17) into (1.16), we obtain

\[ h(\theta \mid x) = \frac{f(t, \theta)}{p(t)} = f(\theta \mid t) . \]

Hence, the Bayes estimator for $\theta$ becomes

\[ E(\theta \mid x) = E(\theta \mid t) \tag{1.18} \]

and the empirical Bayes estimator can be based on
$t = (t_1, t_2, \ldots, t_s)$ rather than $x = (x_1, x_2, \ldots, x_k)$.
Thus, the sequence (1.7) which represents past experience
can be written as

\[ (T_1, \theta_1), (T_2, \theta_2), \ldots, (T_n, \theta_n) \tag{1.19} \]

with each vector pair $(T, \theta)$ distributed identically and
independently with probability density $f(t \mid \theta)\, g(\theta)$.
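The conclusion (1.18) can be illustrated with the Weibull density (1.1) when the shape parameter is known, in which case the sum of the observations raised to the shape power is sufficient for the scale parameter. The sketch below is a numerical illustration under stated assumptions: the grid, the prior `math.exp(-a)`, and the two samples are hypothetical choices, and the function names are not from the text.

```python
import math


def weibull_likelihood(sample, alpha, beta):
    """Likelihood (1.14) for the Weibull density (1.1)."""
    return math.prod(
        alpha * beta * x ** (beta - 1) * math.exp(-alpha * x ** beta) for x in sample
    )


def posterior_mean_alpha(sample, beta, grid, prior):
    """Grid approximation to E(alpha | x) using the full likelihood."""
    weights = [weibull_likelihood(sample, a, beta) * prior(a) for a in grid]
    return sum(a * w for a, w in zip(grid, weights)) / sum(weights)


beta = 2.0
grid = [0.01 * i for i in range(1, 501)]   # hypothetical grid for alpha
prior = lambda a: math.exp(-a)             # hypothetical prior density, for illustration

# Two different samples sharing the same value t = sum(x**beta) = 5.
sample_a = [1.0, 2.0]
sample_b = [math.sqrt(2.5), math.sqrt(2.5)]

mean_a = posterior_mean_alpha(sample_a, beta, grid, prior)
mean_b = posterior_mean_alpha(sample_b, beta, grid, prior)
```

The two samples differ only through the factor $c(x)$ of (1.15), which cancels in the posterior, so `mean_a` and `mean_b` coincide up to rounding.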

1.7 Maximum-Likelihood Estimation

In this section several properties of maximum-likelihood
estimation are presented for completeness.
These properties will be needed in the remaining chapters
and can essentially be found in Kendall and Stuart [16].
Throughout this section the reader is reminded that the
parameter $\theta$, specifying the particular member of the
family of distributions under consideration, is not assumed
to be a random variable. Thus, notation such as $f(x \mid \theta)$
merely expresses the dependence of the density on the
parameter $\theta$ and should not be confused with the analogous
conditional probability statement. This notation is
followed, since in the remaining sections, $\theta$ is assumed
to be a random variable.

The maximum-likelihood estimate of $\theta$ is defined as
that value, say $\hat\theta_k$, within the range of $\theta$ which
maximizes the likelihood function (1.14). The subscript $k$
is used to denote the dependence of the estimate on the
sample size. If the likelihood function $L$ is a twice
differentiable function of $\theta$ throughout its range and if
stationary values of $L(x \mid \theta)$ exist, then the maximum-likelihood
estimates of $\theta_1, \theta_2, \ldots, \theta_s$ can be found by
solution of the system of equations

\[ \frac{\partial L(x \mid \theta)}{\partial \theta_i} = 0 , \qquad (i = 1, 2, \ldots, s) , \tag{1.20} \]

for $\hat\theta_{k,1}, \hat\theta_{k,2}, \ldots, \hat\theta_{k,s}$.

In practice it is often simpler to work with the
logarithm of the likelihood function rather than the
function itself, since $L(x \mid \theta)$ and $\log L(x \mid \theta)$ have their
maximum at the same value of $\theta \in \Theta$. Thus, maximum-likelihood
estimates can be found by the solution of the
system of equations

\[ \frac{\partial \log L(x \mid \theta)}{\partial \theta_i} = 0 , \qquad (i = 1, 2, \ldots, s) . \tag{1.21} \]

When $s = 1$, (1.21) reduces simply to

\[ \frac{d \log L(x \mid \theta)}{d\theta} = 0 . \tag{1.22} \]
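As an illustration of (1.22), consider the Weibull density (1.1) with the shape parameter treated as known. Setting the derivative of the log-likelihood with respect to the scale parameter to zero gives $k/\alpha - \sum x_i^{\beta} = 0$, so the stationary point is $k / \sum x_i^{\beta}$. The sketch below checks this on a made-up sample; the data values and function names are assumptions for illustration.

```python
import math


def log_likelihood(alpha, sample, beta):
    """Log of the likelihood (1.14) for the Weibull density (1.1)."""
    k = len(sample)
    return (
        k * math.log(alpha)
        + k * math.log(beta)
        + (beta - 1) * sum(math.log(x) for x in sample)
        - alpha * sum(x ** beta for x in sample)
    )


def mle_scale(sample, beta):
    """Root of d log L / d alpha = k / alpha - sum(x**beta) = 0, shape beta known."""
    return len(sample) / sum(x ** beta for x in sample)


sample = [0.8, 1.1, 0.5, 1.4, 0.9]   # hypothetical observations
beta = 2.0
alpha_hat = mle_scale(sample, beta)

# The score (derivative of the log-likelihood) should vanish at alpha_hat.
score_at_mle = len(sample) / alpha_hat - sum(x ** beta for x in sample)
```

Because the second derivative, $-k/\alpha^2$, is negative everywhere, this stationary point is the maximum.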


Consider the univariate density function $f(x \mid \theta)$. If
a sufficient statistic $t$ exists for the parameter vector
$\theta$, it is readily seen that the maximum-likelihood estimator
$\hat\theta_k$ must be a function of it. This is true since the
sufficiency of $t$ for $\theta$ implies the factorization given
by (1.15). Choosing $\hat\theta_k$ to maximize the likelihood
function is thus equivalent to choosing $\hat\theta_k$ to maximize
$f(t \mid \theta)$, and hence $\hat\theta_k$ is a function of $t$ alone. If
$\hat\theta_k$ represents a one-to-one transformation on $t$, then
$\hat\theta_k$ will be sufficient for $\theta$.

In a proof given by Wald [32], the MLE is shown to be
consistent under quite general conditions. A simplified
version of this proof is available in [16]. The generality
of the conditions is clear from the absence of any
regularity constraints on the density function $f(x \mid \theta)$.
These conditions require only the existence of certain
integrals, a requirement that is satisfied by most distributions
and, in particular, by the various forms of the Weibull
distribution.

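Consider the case where $s = 1$. If the first two
derivatives of the likelihood function with respect to $\theta$
exist, if

\[ E\left( \frac{\partial \log L(x \mid \theta)}{\partial \theta} \right) = 0 \tag{1.23} \]

and if

\[ R^2(\theta) = -E\left( \frac{\partial^2 \log L(x \mid \theta)}{\partial \theta^2} \right) = E\left[ \left( \frac{\partial \log L(x \mid \theta)}{\partial \theta} \right)^2 \right] \tag{1.24} \]

exists and is nonvanishing for all $\theta \in \Theta$, then the
maximum-likelihood estimator $\hat\theta_k$ can be shown [16] to be
asymptotically normally distributed with mean $\theta$ and
variance $1/R^2(\theta)$, that is,

\[ \mathrm{distr.}(\hat\theta_k) \longrightarrow N\left( \theta, \frac{1}{R^2(\theta)} \right) . \tag{1.25} \]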
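It will be instructive to show under what conditions
the assumptions (1.23) and (1.24) are valid. In this
regard consider

\[ E\left( \frac{\partial \log L(x \mid \theta)}{\partial \theta} \right) = \int_{\mathcal{X}} \left( \frac{1}{L(x \mid \theta)} \frac{\partial L(x \mid \theta)}{\partial \theta} \right) L(x \mid \theta)\, dx = \int_{\mathcal{X}} \frac{\partial L(x \mid \theta)}{\partial \theta}\, dx . \tag{1.26} \]

If differentiation and integration can be interchanged,
then (1.26) becomes

\[ E\left( \frac{\partial \log L(x \mid \theta)}{\partial \theta} \right) = \frac{\partial}{\partial \theta} \int_{\mathcal{X}} L(x \mid \theta)\, dx = 0 \]

since

\[ \int_{\mathcal{X}} L(x \mid \theta)\, dx = 1 . \tag{1.27} \]

Differentiating (1.26) again, we obtain

\[ \int_{\mathcal{X}} \left[ \left( \frac{1}{L(x \mid \theta)} \frac{\partial L(x \mid \theta)}{\partial \theta} \right) \frac{\partial L(x \mid \theta)}{\partial \theta} + L(x \mid \theta) \frac{\partial}{\partial \theta} \left( \frac{1}{L(x \mid \theta)} \frac{\partial L(x \mid \theta)}{\partial \theta} \right) \right] dx = 0 \]

which becomes

\[ \int_{\mathcal{X}} \left[ \left( \frac{1}{L(x \mid \theta)} \frac{\partial L(x \mid \theta)}{\partial \theta} \right)^2 + \frac{\partial^2 \log L(x \mid \theta)}{\partial \theta^2} \right] L(x \mid \theta)\, dx = 0 \]

or

\[ E\left[ \left( \frac{\partial \log L(x \mid \theta)}{\partial \theta} \right)^2 \right] = -E\left( \frac{\partial^2 \log L(x \mid \theta)}{\partial \theta^2} \right) . \tag{1.28} \]

Thus, the only condition necessary to establish assumptions
(1.23) and (1.24) when the first two derivatives of the
likelihood function exist is the ability to interchange
integration and differentiation. See, for example,
Cramér [5].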
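The identity (1.28) can be checked by simulation for the Weibull density (1.1) with known shape parameter. In that case each $x_i^{\beta}$ is exponentially distributed with rate $\alpha$, the score is $k/\alpha - \sum x_i^{\beta}$, and the right-hand side of (1.28) equals $k/\alpha^2$. The following Monte Carlo sketch is an illustration only; the seed, the replication count, and the function names are arbitrary assumptions.

```python
import random


def score(alpha, t, k):
    """d log L / d alpha for the Weibull scale with known shape, where t = sum(x**beta)."""
    return k / alpha - t


def simulate_information(alpha, k, reps, seed=0):
    """Monte Carlo estimate of E[(d log L / d alpha)**2], the left side of (1.28)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        # Each x**beta is exponential with rate alpha, so t is a sum of k such draws.
        t = sum(rng.expovariate(alpha) for _ in range(k))
        total += score(alpha, t, k) ** 2
    return total / reps


alpha, k = 2.0, 5
estimated = simulate_information(alpha, k, reps=20000)
exact = k / alpha ** 2   # -E[d^2 log L / d alpha^2], the right side of (1.28)
```

With these settings `estimated` should lie close to `exact`, within ordinary Monte Carlo error.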


For the general case $s > 1$, analogous arguments to
those above (see [16]) verify that $\hat\theta_k$ is asymptotically
distributed as a multivariate normal distribution with
mean vector $\theta$ and covariance matrix $V$ whose inverse
$V^{-1}$ has elements given by

\[ \left( V^{-1} \right)_{ij} = -E\left( \frac{\partial^2 \log L(x \mid \theta)}{\partial \theta_i\, \partial \theta_j} \right) . \tag{1.29} \]


1.8 Outline of Succeeding Chapters


In Chapter II a continuously smooth empirical Bayes 


estimator is developed. This estimator is obtained by 
replacing the continuous prior density in the Bayes
estimator by a suitable approximation. The approximation 
is based on a sequence of consistent estimates and is 
shown to converge in probability to the prior density as 
both the number of past experiments n and the sample 
size k tend to infinity. Since both n and k are 
finite for practical application, the mean and variance 
of the marginal distribution of these estimates may not 
coincide with those of the prior distribution. Therefore, 
a method for transforming these estimates into a new 
sequence of values having a marginal distribution whose 
mean and variance approximate those of the prior distri- 
bution is illustrated. This new sequence of values is 
then used to obtain an alternative approximation to the 
prior density. 

In Chapter III smooth empirical Bayes estimates are 
obtained for the discrete Poisson distribution. This 
chapter has been included in order to exemplify the 
versatility of the smooth estimation procedure. The 
distribution has received considerable attention from 
empirical Bayes authors and can be used to provide a 
common mode for comparison with other well known and 
proven empirical Bayes estimators. Two such methods are 
used for comparison in Chapter VII. 
In Chapters IV, V, and VI smooth empirical Bayes
estimation is applied to the Weibull distribution.
Chapter IV provides smooth empirical Bayes estimators for
the scale parameter $\alpha$ when the shape parameter $\beta$ is
known. Chapter V considers the reverse situation, and
Chapter VI provides smooth estimators when both parameters
are known to vary unpredictably. In each of these chapters,
results from Monte Carlo simulation show that even for
small sample sizes and few past experiments, the smooth
estimators have smaller mean-squared errors than the
classical maximum-likelihood estimators.

In Chapter VII empirical Bayes methods for point 
estimation developed by Rutherford and Krutchkoff [28] 
and Lemon and Krutchkoff [17] are outlined. Where 
applicable, these methods are applied to the distributions 
of the preceding chapters, and Monte Carlo simulations 
are performed. The results from these simulations are 
then directly compared with the results obtained by using 
the continuously smooth empirical Bayes estimators. 

A summary of the conclusions derived from this
research is presented in Chapter VIII, and areas recommended
for future research are also discussed.



CHAPTER II 


A CONTINUOUSLY SMOOTH EMPIRICAL BAYES ESTIMATOR 

In this chapter a continuously smooth empirical Bayes 
estimator is developed. The estimator is obtained by a 
continuous approximation to the prior density function 
and offers a new approach for obtaining empirical Bayes 
point estimates. 

2.1 Notation and Preliminaries

Throughout this dissertation the term consistent
estimator will frequently be employed. To say that $\hat\theta_k$
is a consistent estimator for $\theta$ will imply that for any
positive numbers $\epsilon$ and $\delta$, however small, there exists
an integer $K$ such that for $k > K$

\[ \Pr\left( |\hat\theta_k - \theta| < \epsilon \right) > 1 - \delta . \]

This will often be referred to as convergence in probability
and for notational convenience will be written as

\[ \underset{k \to \infty}{\mathrm{p\,lim}}\ \hat\theta_k = \theta . \]

Assume that an unobservable random parameter
$\theta = (\theta_1, \theta_2, \ldots, \theta_s)$ occurs according to the distribution
$G(\theta)$ with corresponding density function $g(\theta)$. When $\theta$
is realized, an observable random k-vector $X$ occurs
according to the conditional distribution $F(x \mid \theta)$. When
a sufficient estimate $\hat\theta_k$ of $\theta$ exists, the Bayes estimator
$E(\theta_j \mid x)$ for $\theta_j$, the jth component of $\theta$, is given
by

\[ E(\theta_j \mid x) = \frac{\int_{\Theta} \theta_j\, f(\hat\theta_k \mid \theta)\, g(\theta)\, d\theta}{\int_{\Theta} f(\hat\theta_k \mid \theta)\, g(\theta)\, d\theta} \tag{2.1} \]

where $f(\hat\theta_k \mid \theta)$ denotes the conditional density function of
the sufficient estimator $\hat\theta_k$ given $\theta$. Any such function
$f(\cdot \mid \cdot)$ will be referred to as the kernel of integration.


When the prior density $g(\theta)$ is unknown, the Bayes
estimator is unattainable. However, if the situation
described above occurs repeatedly with the same, but
unknown, $g(\theta)$, then a sequence of $n$ sufficient estimates

\[ \hat\theta_{k,1}, \hat\theta_{k,2}, \ldots, \hat\theta_{k,n} \tag{2.2} \]

obtained from previous replications can be used to construct
a continuous approximation to the prior density function.
Based on this approximation an empirical Bayes estimate of
$\theta_n$, the nth realization from $g(\theta)$, can be determined.
2.2 Marginal Density Approximation


The marginal density function of $\hat\theta_k$ is given by

\[ p(\hat\theta_k) = \int_{\Theta} f(\hat\theta_k \mid \theta)\, g(\theta)\, d\theta . \tag{2.3} \]

When $g(\theta)$ is unknown, $p(\hat\theta_k)$ cannot be determined; however,
consistent density estimators are available which can be
used to approximate $p(\hat\theta_k)$. These estimators can be constructed
using sequence (2.2). They take the form

\[ p_n(\hat\theta_k) = \frac{1}{n h^s} \sum_{i=1}^{n} W\left( \frac{\hat\theta_k - \hat\theta_{k,i}}{h} \right) \tag{2.4} \]

where $h$ is a function of $n$ satisfying

\[ \lim_{n \to \infty} h(n) = 0 \tag{2.5} \]

and

\[ \lim_{n \to \infty} n h^s(n) = \infty . \tag{2.6} \]

Acceptable forms for the function $W$ are given by
Parzen [24] for univariate densities, $s = 1$, and by
Martz [19] for multivariate densities. In particular
for estimating a univariate density function, the estimator

[Equation (2.7), giving the explicit form of $p_n(\hat\theta_k)$, is illegible in the source.]

where

\[ h = n^{-1/5} \tag{2.8} \]

will be used. This particular estimator has been demonstrated
to possess certain desirable properties (see
Clemmer and Krutchkoff [3] and Martz and Krutchkoff [20]).

In practice, the estimates $\hat\theta_{k,i}$ $(i = 1, 2, \ldots, n)$ are
quantities carrying units, and it becomes necessary to multiply
$h$ by an appropriate function to remove these units of
measurement from the argument of $W$ in (2.4). For example
in (2.7), this may be accomplished by redefining $h$.

[Equations (2.9) and (2.10), which give this rescaled $h$ and the quantity on which it depends, are illegible in the source.]


The estimators $p_n(\hat\theta_k)$ are squared error consistent for
estimating the density $p(\hat\theta_k)$ in the sense that

\[ \lim_{n \to \infty} E\left[ p_n(\hat\theta_k) - p(\hat\theta_k) \right]^2 = 0 \tag{2.11} \]

for all $\hat\theta_k$ in the continuity set of $p(\cdot)$. This convergence
implies a different interpretation of squared error
consistency. Usually, it is the sample size $k$ which
tends to infinity and not the number of experiences $n$.
While increasing sample size is conceptual, in the empirical
Bayes situation the number of experiences may actually grow
without bound. Thus in application, the convergence given
by (2.11) represents a natural result.
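A univariate estimator of the form (2.4), with the bandwidth rule $h = n^{-1/5}$ of (2.8), can be sketched as follows. The kernel used here is a standard normal density, which is an assumption made purely for illustration and is not claimed to be the kernel the text adopts in (2.7); the sequence of estimates and the function names are likewise hypothetical.

```python
import math


def gaussian_kernel(y):
    """Standard normal density, used here as an assumed stand-in for W."""
    return math.exp(-0.5 * y * y) / math.sqrt(2.0 * math.pi)


def density_estimate(x, estimates, kernel=gaussian_kernel):
    """Estimator of the form (2.4) with s = 1 and h = n**(-1/5) as in (2.8)."""
    n = len(estimates)
    h = n ** (-0.2)
    return sum(kernel((x - e) / h) for e in estimates) / (n * h)


def integrate(f, lo, hi, steps=20000):
    """Composite midpoint rule."""
    width = (hi - lo) / steps
    return width * sum(f(lo + (i + 0.5) * width) for i in range(steps))


# Hypothetical sequence of past estimates, playing the role of sequence (2.2).
estimates = [1.2, 0.9, 1.5, 1.1, 0.7, 1.3]
total_mass = integrate(lambda x: density_estimate(x, estimates), -10.0, 12.0)
```

Any kernel that is itself a density gives an estimate that is nonnegative and integrates to one, which `total_mass` confirms for this choice.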

2.3 Prior Density Approximation

In this section the marginal density estimator
$p_n(\hat\theta_k)$ will be shown to converge in probability to the prior
density function $g(\theta)$, as both the number of experiences
$n$ and the sample size $k$ tend toward infinity. The
following theorem, which is a slightly modified version of
a theorem found in Rao [28], will be needed to establish
the main result.

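Theorem 2.1. If $p_n(\hat\theta_k)$ is a continuous density estimator
and $\hat\theta_k$ is a consistent estimator for $\theta$, then

\[ \underset{k \to \infty}{\mathrm{p\,lim}}\ p_n(\hat\theta_k) = p_n(\theta) . \tag{2.12} \]

Proof. Since $\hat\theta_k$ is consistent for $\theta$, given any positive
numbers $\gamma$ and $\eta$, an integer $K$ exists such that
for $k > K$, $\Pr(|\hat\theta_k - \theta| < \gamma) > 1 - \eta/2$. Now let $I$ be a
finite region such that $\Pr(\theta \in I) = 1 - \eta/2$. Then since
$p_n$ is continuous, $|p_n(\hat\theta_k) - p_n(\theta)| < \epsilon$, for any arbitrarily
chosen $\epsilon > 0$, if $|\hat\theta_k - \theta| < \gamma$ for $\theta \in I$. Hence,

\[ \Pr\left( |p_n(\hat\theta_k) - p_n(\theta)| < \epsilon \right) \ge \Pr\left( |\hat\theta_k - \theta| < \gamma,\ \theta \in I \right) \]
\[ \ge \Pr\left( |\hat\theta_k - \theta| < \gamma \right) - \Pr(\theta \notin I) \]
\[ \ge 1 - \eta \]

for $k > K$.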

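Theorem 2.2. If the conditions of Theorem 2.1 are satisfied
and if

\[ \underset{n \to \infty}{\mathrm{p\,lim}}\ p_n(\theta) = g(\theta) , \tag{2.13} \]

then

\[ \underset{\substack{n \to \infty \\ k \to \infty}}{\mathrm{p\,lim}}\ p_n(\hat\theta_k) = g(\theta) . \tag{2.14} \]

Proof. By Theorem 2.1, given any positive constants $\gamma$
and $\epsilon$, there exists an integer $K$ such that for $k > K$

\[ \Pr\left( |p_n(\hat\theta_k) - p_n(\theta)| < \frac{\epsilon}{2} \right) \ge (1 - \gamma)^{1/2} \]

and since $p_n(\theta)$ is consistent for $g(\theta)$, there exists an
integer $N$ such that for $n > N$

\[ \Pr\left( |p_n(\theta) - g(\theta)| < \frac{\epsilon}{2} \right) \ge (1 - \gamma)^{1/2} . \]

Now

\[ |p_n(\hat\theta_k) - g(\theta)| \le |p_n(\hat\theta_k) - p_n(\theta)| + |p_n(\theta) - g(\theta)| . \]

Thus for $n > Z$ and $k > Z$ where $Z = \max(N, K)$,

\[ \Pr\left( |p_n(\hat\theta_k) - g(\theta)| < \epsilon \right) \ge \Pr\left( |p_n(\hat\theta_k) - p_n(\theta)| + |p_n(\theta) - g(\theta)| < \epsilon \right) \]
\[ \ge \Pr\left( |p_n(\hat\theta_k) - p_n(\theta)| < \frac{\epsilon}{2},\ |p_n(\theta) - g(\theta)| < \frac{\epsilon}{2} \right) . \]

The events

\[ |p_n(\hat\theta_k) - p_n(\theta)| < \frac{\epsilon}{2} \tag{2.15} \]

and

\[ |p_n(\theta) - g(\theta)| < \frac{\epsilon}{2} \tag{2.16} \]

are independent since the occurrence of (2.15) depends
only on the kernel of integration $f(\hat\theta_k \mid \theta)$, and the
occurrence of (2.16) depends only on the prior density
$g(\theta)$. These densities are clearly independent; hence

\[ \Pr\left( |p_n(\hat\theta_k) - g(\theta)| < \epsilon \right) \ge \Pr\left( |p_n(\hat\theta_k) - p_n(\theta)| < \frac{\epsilon}{2} \right) \Pr\left( |p_n(\theta) - g(\theta)| < \frac{\epsilon}{2} \right) \]
\[ \ge (1 - \gamma)^{1/2} (1 - \gamma)^{1/2} = 1 - \gamma . \]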

2.4 Continuously Smooth Estimators

The limiting result in (2.14) is somewhat artificial
in practical applications since small values of $k$ and $n$
are usually encountered. Nevertheless, this property does
suggest the replacement of the prior density in the Bayes
estimator (2.1) by $p_n(\hat\theta_k)$ when considered as a function of
$\theta$. The continuously smooth empirical Bayes estimator for
$\theta_j$, the jth component of $\theta_n$, becomes

\[ \theta_D = \frac{\int_{\Theta} \theta_j\, f(\hat\theta_{k,n} \mid \theta)\, p_n^*(\theta)\, d\theta}{\int_{\Theta} f(\hat\theta_{k,n} \mid \theta)\, p_n^*(\theta)\, d\theta} \tag{2.17} \]

where

[Equation (2.18), defining $p_n^*(\theta)$, is illegible in the source.]

For simplicity we have not indexed $\theta_D$ with a subscript
$j$. Denote the vector having components given by (2.17)
as $\theta_D$. In particular for $s = 1$, (2.17) takes the form

[Equation (2.19), the explicit univariate form of (2.17), is illegible in the source.]

where $h$ is given by (2.8). The subscript $D$ is a
notational convenience and denotes the particular distribution
of the estimator $\hat\theta_k$ when given $\theta$. For example,
if the kernel of integration is normal then (2.17) would
be represented as $\theta_N$.

In practice the actual range of $\theta$ will generally be
unknown. This can be resolved satisfactorily by taking
the region of integration in (2.17) to be the observed
range of previous estimates. Thus it is necessary only to
order the sequence of estimates given in (2.2),
from which the range is easily calculated. The success of
this approximation will be demonstrated in the remaining
chapters.
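The ratio in (2.17) can be sketched numerically for $s = 1$, integrating over the observed range of the previous estimates as described above. Everything specific in this sketch is an assumption made for illustration: the prior approximation is taken to be a Gaussian-kernel estimate with $h = n^{-1/5}$, the kernel of integration is taken to be normal with mean $\theta$ and standard deviation $\theta/\sqrt{10}$, and the estimates are made-up numbers. It is not claimed to reproduce the exact estimator of the text.

```python
import math


def normal_pdf(x, mean, sd):
    """Normal density with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def prior_estimate(theta, estimates):
    """Assumed Gaussian-kernel density estimate with h = n**(-1/5), evaluated at theta."""
    n = len(estimates)
    h = n ** (-0.2)
    return sum(normal_pdf(theta, e, h) for e in estimates) / n


def smooth_eb_estimate(estimates, kernel, steps=2000):
    """Ratio of integrals as in (2.17), s = 1, over the observed range of estimates."""
    current = estimates[-1]                  # estimate from the present experiment
    lo, hi = min(estimates), max(estimates)
    width = (hi - lo) / steps
    num = den = 0.0
    for i in range(steps):
        theta = lo + (i + 0.5) * width       # midpoint rule; width cancels in the ratio
        w = kernel(current, theta) * prior_estimate(theta, estimates)
        num += theta * w
        den += w
    return num / den


def example_kernel(estimate, theta):
    """Hypothetical kernel of integration: normal, mean theta, sd theta / sqrt(10)."""
    return normal_pdf(estimate, theta, theta / math.sqrt(10.0))


estimates = [1.8, 2.4, 2.1, 1.6, 2.9, 2.2]   # hypothetical sequence, last one is current
theta_d = smooth_eb_estimate(estimates, example_kernel)
```

Because the result is a weighted average of values of $\theta$ over the region of integration, it always falls inside the observed range of the estimates.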

Occasionally for certain families of distributions,
no sufficient statistic exists for estimating $\theta$. In
such cases some statistic $\hat\theta_k$ may exist which can be
used in the formulation of $\theta_D$. This, of course, represents
a further degree of approximation since
$E(\theta \mid x) \ne E(\theta \mid \hat\theta_k)$. The estimator $\theta_D$, however, may
continue to produce "good" results. This situation will
occur in Chapter V and Chapter VI, and the continued success
of $\theta_D$ will be witnessed.

The ability of the estimator $\theta_D$ to provide "good"
results depends on the accuracy of several approximations,
the most significant being the accuracy with which the
density estimator $p_n(\theta)$ represents the prior density
$g(\theta)$. If $\theta_D$ provides "better" estimates for the previous
realizations of $\theta$ than the elements of sequence (2.2),
then iteration of $\theta_D$ may achieve still further improvement.
In each iteration the density estimator for the
prior density function is based on the sequence of
previous estimates $\theta_{D,i}$ $(i = 2, 3, \ldots, n)$. When $i = 1$,
$\theta_{D,1}$ is defined to be $\hat\theta_{k,1}$. Results from such
iterations as well as further discussion are reported in
the remaining chapters.

2.5 Marginal Variance Correction

In the construction of the continuously smooth empirical
Bayes estimator $\theta_D$, it has been suggested that a
consistent density estimator be used to represent the prior
density. It would, therefore, seem desirable to base this
density estimator on a sequence of values

\[ \theta^*_{k,1}, \theta^*_{k,2}, \ldots, \theta^*_{k,n} \tag{2.20} \]

whose marginal distribution has mean and variance approximately
equal to those of the prior density.

For finite samples of size $k$, the mean and variance
of the marginal distribution of $\hat{\theta}_k$ are generally not
equivalent to those of the prior distribution. A linear
transformation, however, can be constructed on the elements
of sequence (2.2) to provide a new sequence of
values having a mean and variance which are approximately
equal to the prior mean and variance. To illustrate this
procedure when $s = 1$, consider the kernel of integration to be

$$\text{distr.}(\hat{\theta}_k \mid \theta) = N\!\left(\theta,\ \frac{\theta^2}{kc}\right) \tag{2.21}$$

where $c$ is a known constant and $k$ is the sample size.
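This procedure is quite general, and the normal distribution
is only used for purposes of illustration. If the
relations of conditional probability

$$E(\hat{\theta}_k) = E_\theta\!\left[E(\hat{\theta}_k \mid \theta)\right] \tag{2.22}$$

and

$$\mathrm{Var}(\hat{\theta}_k) = E_\theta\!\left[\mathrm{Var}(\hat{\theta}_k \mid \theta)\right] + \mathrm{Var}_\theta\!\left[E(\hat{\theta}_k \mid \theta)\right] \tag{2.23}$$

are applied to (2.21), the mean and variance of the marginal
distribution of $\hat{\theta}_k$ become

$$E(\hat{\theta}_k) = E(\theta)$$

and

$$\begin{aligned}
\mathrm{Var}(\hat{\theta}_k) &= E\!\left(\frac{\theta^2}{kc}\right) + \mathrm{Var}(\theta) \\
&= \frac{1}{kc}\,E(\theta^2) + \mathrm{Var}(\theta) \\
&= \frac{1}{kc}\left(\mathrm{Var}(\theta) + E^2(\theta)\right) + \mathrm{Var}(\theta) \\
&= \frac{1+kc}{kc}\,\mathrm{Var}(\theta) + \frac{1}{kc}\,E^2(\theta)
\end{aligned} \tag{2.24}$$

respectively. Thus for finite $k$, the mean of $\hat{\theta}_k$
unconditional on $\theta$ will be equal to the prior mean, but
the variance of $\hat{\theta}_k$ overestimates the prior variance by
the amount

$$\frac{1}{kc}\left[\mathrm{Var}(\theta) + E^2(\theta)\right].$$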


In the light of this observation, consider the
transformation of $\hat{\theta}_k$ given by

$$\theta^*_k = a_1\left(\hat{\theta}_k - E(\hat{\theta}_k)\right) + E(\theta) \tag{2.25}$$

where $a_1$ is a constant to be determined. Now

$$E(\theta^*_k) = a_1 E(\hat{\theta}_k) - a_1 E(\hat{\theta}_k) + E(\theta) = E(\theta)$$

and

$$\mathrm{Var}(\theta^*_k) = \mathrm{Var}(a_1 \hat{\theta}_k) = a_1^2\,\mathrm{Var}(\hat{\theta}_k). \tag{2.26}$$


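Substituting (2.24) into (2.26) we obtain

$$\mathrm{Var}(\theta^*_k) = a_1^2\left[\frac{1+kc}{kc}\,\mathrm{Var}(\theta) + \frac{1}{kc}\,E^2(\theta)\right]
= a_1^2\left[\frac{(1+kc)\,\mathrm{Var}(\theta) + E^2(\theta)}{kc}\right]. \tag{2.27}$$

Hence by defining

$$a_1 = \left[\frac{kc\,\mathrm{Var}(\theta)}{(1+kc)\,\mathrm{Var}(\theta) + E^2(\theta)}\right]^{1/2} \tag{2.28}$$

we obtain

$$\mathrm{Var}(\theta^*_k) = \mathrm{Var}(\theta),$$

the desired result.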

If the mean and variance of the prior distribution
are known, then $a_1$ can be exactly determined. In
practice, however, these quantities will generally remain
unknown; hence estimates for these quantities must be
obtained. Since $E(\theta) = E(\hat{\theta}_k)$, the prior mean can be
estimated by the sample mean

$$\bar{\theta}_n = \frac{1}{n}\sum_{i=1}^{n} \hat{\theta}_{k,i}. \tag{2.29}$$

Estimating $\mathrm{Var}(\hat{\theta}_k)$ by the sample variance

$$s_n^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(\hat{\theta}_{k,i} - \bar{\theta}_n\right)^2 \tag{2.30}$$

and substituting (2.29) and (2.30) into (2.24), we have

$$s_n^2 = \frac{1+kc}{kc}\,\widehat{\mathrm{Var}}(\theta) + \frac{1}{kc}\,\bar{\theta}_n^2. \tag{2.31}$$

Solving (2.31) for $\widehat{\mathrm{Var}}(\theta)$, we obtain

$$\widehat{\mathrm{Var}}(\theta) = \frac{kc}{1+kc}\,s_n^2 - \frac{1}{1+kc}\,\bar{\theta}_n^2 \tag{2.32}$$

as an estimate of the prior variance. Therefore when both
the mean and variance of the prior distribution are unknown,
$\bar{\theta}_n$ and $\widehat{\mathrm{Var}}(\theta)$ can be substituted for
$E(\theta)$ and $\mathrm{Var}(\theta)$ in (2.28) giving

$$a_1 = \left[\frac{1}{1+kc}\left(kc - \frac{\bar{\theta}_n^2}{s_n^2}\right)\right]^{1/2}. \tag{2.33}$$
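The estimation steps (2.29) through (2.33) and the transformation (2.25) can be sketched numerically. The Python fragment below is only an illustration under the normal-kernel assumption (2.21). The function name `variance_corrected` is hypothetical, and clipping a negative variance estimate to zero is an added safeguard that is not part of the development above.

```python
import numpy as np

def variance_corrected(theta_hat, k, c=1.0):
    """Apply (2.25) with a_1 estimated from the data via (2.29)-(2.33)."""
    theta_hat = np.asarray(theta_hat, dtype=float)
    mean_n = theta_hat.mean()                    # sample mean, (2.29)
    s2_n = theta_hat.var(ddof=1)                 # sample variance, (2.30)
    var_prior = (k * c * s2_n - mean_n**2) / (1.0 + k * c)   # (2.32)
    var_prior = max(var_prior, 0.0)              # assumption: clip negative estimates
    a1 = np.sqrt(var_prior / s2_n)               # (2.28) with the estimates substituted
    # E(theta_hat_k) and E(theta) are both estimated by the sample mean here.
    theta_star = a1 * (theta_hat - mean_n) + mean_n
    return theta_star, a1, var_prior

theta_star, a1, var_prior = variance_corrected([4.0, 6.0, 5.0, 7.0, 3.0], k=20)
```

By construction the transformed values keep the sample mean and have sample variance equal to the estimated prior variance, which is the finite-sample analogue of the result following (2.28).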


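The transformation $\theta^*_k$ defined by (2.25) is sufficient
for estimating $\theta$ since it is obtained by a one-to-one
transformation on the sufficient statistic $\hat{\theta}_k$. To
show that it is also consistent, consider

$$\operatorname*{p\,lim}_{k\to\infty} \theta^*_k = \operatorname*{p\,lim}_{k\to\infty}\left[a_1\hat{\theta}_k - a_1 E(\hat{\theta}_k) + E(\theta)\right]. \tag{2.34}$$

Now it is easily shown that both representations for
$a_1$, (2.28) and (2.33), have the property that

$$\lim_{k\to\infty} a_1 = 1.$$

Thus, since $E(\hat{\theta}_k) = E(\theta)$, (2.34) becomes

$$\operatorname*{p\,lim}_{k\to\infty} \theta^*_k = \operatorname*{p\,lim}_{k\to\infty} \hat{\theta}_k = \theta \tag{2.35}$$

and $\theta^*_k$ is therefore a consistent estimator for $\theta$.

Hence by Theorems 2.1 and 2.2, a density estimator based
on sequence (2.20) has the property that

$$\operatorname*{p\,lim}_{\substack{n\to\infty \\ k\to\infty}} p_n(\theta^*_k) = g(\theta).$$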



CHAPTER III 


ESTIMATION IN THE POISSON DISTRIBUTION 

The Poisson distribution has played a significant 
role in the development of empirical Bayes techniques. 
Robbins [26] first introduced the empirical Bayes approach 
with the Poisson distribution. Recently, new empirical 
Bayes methods have been illustrated using this distribu- 
tion. In keeping with the historical significance of the 
Poisson distribution, the usefulness of the continuously 
smooth empirical Bayes method will first be illustrated 
with this distribution. 

Maximum-likelihood estimation is chosen as the 
classical method of estimation, and results from Monte 
Carlo simulations are reported, which show that the smooth 
estimators have smaller mean-squared errors than the 
maximum-likelihood estimators. These results are reported 
for small sample sizes and few past experiences. 

3.1 Maximum-Likelihood Estimation

Assume that the conditional distribution of $X$,
given any value $\theta > 0$, is Poisson with probability mass
function

$$f(x \mid \theta) = \frac{e^{-\theta}\theta^{x}}{x!} \qquad (x = 0, 1, 2, \ldots;\ \theta > 0). \tag{3.1}$$


Consider a random sample of $k$ observations from
(3.1). The likelihood function of this sample is

$$L(x \mid \theta) = \frac{e^{-k\theta}\,\theta^{\sum_{i=1}^{k} x_i}}{\prod_{i=1}^{k} x_i!}. \tag{3.2}$$

Since the Poisson distribution satisfies the necessary
regularity conditions given in section 1.7, the maximum-likelihood
estimator for $\theta$ can be found by the solution
of equation (1.22). Therefore taking the logarithm of
(3.2) and differentiating with respect to $\theta$, we have

$$\frac{d \log L(x \mid \theta)}{d\theta} = -k + \theta^{-1}\sum_{i=1}^{k} x_i. \tag{3.3}$$

Equating (3.3) to zero and solving for $\theta$, we obtain the
maximum-likelihood estimator

$$\hat{\theta}_k = \frac{1}{k}\sum_{i=1}^{k} x_i. \tag{3.4}$$


Observation of equation (3.2) reveals that the likelihood
function can be expressed as the product

$$L(x \mid \theta) = Q(t \mid \theta)\,C(x)$$

where

$$C(x) = \left[\prod_{i=1}^{k} x_i!\right]^{-1}$$

and

$$Q(t \mid \theta) = \theta^{t} e^{-k\theta}.$$

Thus by the factorization criterion [23]

$$T = \sum_{i=1}^{k} X_i$$

is a sufficient statistic for $\theta$. The maximum-likelihood
estimator $\hat{\theta}_k$ can clearly be obtained by a one-to-one
transformation on $T$ and therefore represents a sufficient
estimator for $\theta$.
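As a small numerical illustration of (3.2) through (3.4), the following Python sketch evaluates the Poisson log-likelihood, its derivative, and the maximum-likelihood estimate for a made-up sample. The function names and the sample values are hypothetical and serve only to show that the score (3.3) vanishes at the sample mean.

```python
import math

def poisson_log_likelihood(x, theta):
    """Logarithm of (3.2) for a sample x and parameter theta > 0."""
    k = len(x)
    return (-k * theta + sum(x) * math.log(theta)
            - sum(math.lgamma(xi + 1) for xi in x))

def poisson_score(x, theta):
    """Derivative of the log-likelihood, equation (3.3)."""
    return -len(x) + sum(x) / theta

def poisson_mle(x):
    """Maximum-likelihood estimate (3.4): the sample mean."""
    return sum(x) / len(x)

sample = [2, 0, 3, 1]          # illustrative data only
theta_hat = poisson_mle(sample)
```

The estimate depends on the sample only through the total, in line with the sufficiency of $T$.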

It is evident that given any $\theta > 0$, the only
possible values for $\hat{\theta}_k$ are $0, 1/k, 2/k, \ldots$. For a
particular value $t/k$, the values $x = (x_1, x_2, \ldots, x_k)$
must be such that $\sum x_i = t$; hence the probability
$f(t/k \mid \theta)$ that $\hat{\theta}_k$ takes on the value $t/k$ is obtained
by summing (3.2) over all sets $x$ such that $\sum x_i = t$.
That is,

$$f(t/k \mid \theta) = \sum_{A} \frac{e^{-k\theta}\theta^{t}}{\prod_{i=1}^{k} x_i!}
= e^{-k\theta}\theta^{t} \sum_{A} \frac{1}{\prod_{i=1}^{k} x_i!} \tag{3.5}$$

where $A$ represents all possible sets $x$ such that
$\sum x_i = t$. The multinomial theorem states that

$$(y_1 + y_2 + \cdots + y_k)^{t} = \sum_{A} t! \prod_{i=1}^{k} \frac{y_i^{x_i}}{x_i!}; \tag{3.6}$$

therefore setting each $y_i = 1$, we have

$$\sum_{A} \frac{1}{\prod_{i=1}^{k} x_i!} = \frac{k^{t}}{t!}$$

and from (3.5)

$$f(t/k \mid \theta) = \frac{e^{-k\theta}(\theta k)^{t}}{t!}. \tag{3.7}$$

Writing (3.7) explicitly as a function of the maximum-likelihood
estimator $\hat{\theta}_k$, we obtain

$$f(\hat{\theta}_k \mid \theta) = \frac{e^{-k\theta}(k\theta)^{k\hat{\theta}_k}}{(k\hat{\theta}_k)!}. \tag{3.8}$$

Since there is a one-to-one correspondence between
$\hat{\theta}_k = T/k$ and $T = \sum x_i$, the probability mass function
of $T$ is

$$f(t \mid \theta) = \frac{e^{-k\theta}(k\theta)^{t}}{t!}, \qquad (t = 0, 1, 2, \ldots). \tag{3.9}$$

Hence

$$f(\hat{\theta}_k \mid \theta) = f(t \mid \theta) \tag{3.10}$$

which is a Poisson mass function with mean and variance
given by $k\theta$.
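The summation argument leading from (3.5) to (3.7) can be checked by brute force for small $k$ and $t$. The Python sketch below enumerates every set $x$ with $\sum x_i = t$ and compares the summed likelihood (3.2) with the closed form (3.7). The function names and the particular values of $t$, $k$ and $\theta$ are illustrative assumptions.

```python
from itertools import product
from math import exp, factorial, prod

def pmf_total_bruteforce(t, k, theta):
    """Sum the likelihood (3.2) over all samples x with sum(x) == t, as in (3.5)."""
    total = 0.0
    for x in product(range(t + 1), repeat=k):
        if sum(x) == t:
            total += prod(exp(-theta) * theta**xi / factorial(xi) for xi in x)
    return total

def pmf_total_closed_form(t, k, theta):
    """Closed form (3.7): a Poisson mass function with parameter k * theta."""
    return exp(-k * theta) * (k * theta)**t / factorial(t)

brute = pmf_total_bruteforce(4, 3, 0.7)
closed = pmf_total_closed_form(4, 3, 0.7)
```

The two values agree up to floating-point rounding, which is the content of the multinomial identity obtained from (3.6).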

3.2 Smooth Empirical Bayes Estimators for $\theta$

Assume that an unobservable random parameter $\theta$
occurs according to the unknown density function $g(\theta)$.
When $\theta$ is realized, an observable random vector $X$
from (3.1) occurs, and a maximum-likelihood estimate of
$\theta$ is formed using (3.4). Now if this situation occurs
repeatedly, then the sequence

$$\hat{\theta}_{k,1},\ \hat{\theta}_{k,2},\ \ldots,\ \hat{\theta}_{k,n} \tag{3.11}$$

of maximum-likelihood estimates can be used to form a
marginal density approximation $p_n(\hat{\theta}_k)$, given by (2.7),
for $p(\hat{\theta}_k)$. Convergence in probability of $p_n(\hat{\theta}_k)$ to the
unknown prior density function $g(\theta)$ is assured by
Theorem 2.2. Therefore when (3.8) is used as the kernel
of integration, $p_n(\hat{\theta}_k)$ can be treated as a function of
$\theta$, and a continuously smooth empirical Bayes estimator
for $\theta_n$, the $n$th realization from $g(\theta)$, becomes

$$\tilde{\theta}_P = \frac{\displaystyle\sum_{i=1}^{n} \int_{\hat{\theta}_{k,(1)}}^{\hat{\theta}_{k,(n)}} \theta\, e^{-k\theta}\, \theta^{k\hat{\theta}_{k,n}} \left[\frac{\sin\!\left(\frac{\theta - \hat{\theta}_{k,i}}{2h}\right)}{\frac{\theta - \hat{\theta}_{k,i}}{2h}}\right]^{2} d\theta}
{\displaystyle\sum_{i=1}^{n} \int_{\hat{\theta}_{k,(1)}}^{\hat{\theta}_{k,(n)}} e^{-k\theta}\, \theta^{k\hat{\theta}_{k,n}} \left[\frac{\sin\!\left(\frac{\theta - \hat{\theta}_{k,i}}{2h}\right)}{\frac{\theta - \hat{\theta}_{k,i}}{2h}}\right]^{2} d\theta}. \tag{3.12}$$

The limits of integration in (3.12), $\hat{\theta}_{k,(1)}$ and
$\hat{\theta}_{k,(n)}$, are the respective minimum and maximum values of
sequence (3.11).
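A ratio of integrals of the form (3.12) can be approximated numerically as sketched below in Python. Several points are assumptions rather than statements of the original method: the squared $\sin(u)/u$ kernel is used as the density estimator, the bandwidth `h` is passed in directly because its defining equation (2.8) lies outside this section, a single eleven-point Gauss-Legendre rule is applied over the whole range, and the name `smooth_eb_poisson` is hypothetical.

```python
import numpy as np

def smooth_eb_poisson(theta_hats, k, h, n_nodes=11):
    """Sketch of (3.12); the last element of theta_hats is the current estimate."""
    theta_hats = np.asarray(theta_hats, dtype=float)
    lo, hi = theta_hats.min(), theta_hats.max()
    if hi == lo:                       # degenerate range: nothing to integrate over
        return float(lo)
    nodes, weights = np.polynomial.legendre.leggauss(n_nodes)
    theta = 0.5 * (hi - lo) * nodes + 0.5 * (hi + lo)   # map [-1, 1] to [lo, hi]
    # Unnormalized prior approximation: sum of squared sin(u)/u terms.
    # np.sinc(x) is sin(pi x)/(pi x), hence the division by pi.
    u = (theta[:, None] - theta_hats[None, :]) / (2.0 * h)
    prior = (np.sinc(u / np.pi) ** 2).sum(axis=1)
    # Poisson kernel e^{-k theta} theta^{k theta_hat_n}, computed on the log scale.
    log_kernel = -k * theta + k * theta_hats[-1] * np.log(theta)
    kernel = np.exp(log_kernel - log_kernel.max())
    mass = weights * kernel * prior
    return float((theta * mass).sum() / mass.sum())

estimate = smooth_eb_poisson([2, 5, 3, 4, 6, 1, 3], k=1, h=1.0)
```

Constant factors common to numerator and denominator cancel, so neither the normalizing constant of the density estimator nor the factorial in (3.8) needs to be evaluated. Because all quadrature weights and integrand factors are nonnegative, the result always lies between the smallest and largest element of the sequence.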


3.3 A Smooth Empirical Bayes Estimator for $\theta$ Corrected
for Variance

Although

$$\operatorname*{p\,lim}_{\substack{n\to\infty \\ k\to\infty}} p_n(\hat{\theta}_k) = g(\theta),$$

in practical application both $n$ and $k$ will remain
finite. The marginal density estimator $p_n(\hat{\theta}_k)$ may
therefore represent a poor approximation to the prior
density. As demonstrated in section 2.5, when the mean
and/or the variance of the marginal distribution of $\hat{\theta}_k$
are not equivalent to those of the prior distribution,
the prior density approximation can often be improved.

Performing the transformation

$$\theta^*_k = a_1\left(\hat{\theta}_k - E(\hat{\theta}_k)\right) + E(\theta) \tag{3.13}$$

on each of the maximum-likelihood estimates of sequence
(3.11), we obtain a sequence of values

$$\theta^*_{k,1},\ \theta^*_{k,2},\ \ldots,\ \theta^*_{k,n}. \tag{3.14}$$

The marginal distribution of this sequence has a mean
and a variance equivalent to those of the prior distribution.
Based on this sequence, an approximation
$p_n(\theta^*_k)$ to the marginal density $p(\theta^*_k)$ can be formed.
When considered as a function of $\theta$, $p_n(\theta^*_k)$ can be
used to represent the prior density in the Bayes estimator
$E(\theta \mid \hat{\theta}_k)$. Thus an alternative smooth estimate of $\theta_n$,
the present realization from $g(\theta)$, can be given.

If the relations of conditional probability (2.22)
and (2.23) are used, then the mean and the variance for
the marginal distribution of the maximum-likelihood
estimator become

$$E(\hat{\theta}_k) = E(k\theta) = kE(\theta) \tag{3.15}$$

and

$$\mathrm{Var}(\hat{\theta}_k) = E(k\theta) + \mathrm{Var}(k\theta) = kE(\theta) + k^2\,\mathrm{Var}(\theta) \tag{3.16}$$

respectively. Since the mean and variance of the
marginal distribution of $\hat{\theta}_k$ overestimate the mean
and variance of the prior distribution, the transformation
given by (3.13) can be applied.

To determine the constant $a_1$ in the transformation
(3.13) so that $\mathrm{Var}(\theta^*_k) = \mathrm{Var}(\theta)$, consider the variance
of $\theta^*_k$:

$$\mathrm{Var}(\theta^*_k) = \mathrm{Var}(a_1\hat{\theta}_k) = a_1^2\,\mathrm{Var}(\hat{\theta}_k). \tag{3.17}$$

Substituting (3.16) for $\mathrm{Var}(\hat{\theta}_k)$ in (3.17), we have

$$\mathrm{Var}(\theta^*_k) = a_1^2\left(kE(\theta) + k^2\,\mathrm{Var}(\theta)\right). \tag{3.18}$$

Thus choosing

$$a_1 = \left[\frac{\mathrm{Var}(\theta)}{kE(\theta) + k^2\,\mathrm{Var}(\theta)}\right]^{1/2} \tag{3.19}$$

we obtain the desired result.


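When the mean and the variance of $\theta$ are known,
$a_1$ can be exactly determined by (3.19). In practice,
however, these values are generally unknown and require
estimation. Since $E(\hat{\theta}_k) = kE(\theta)$, the prior mean
can be estimated by $\bar{\theta}_n/k$ where $\bar{\theta}_n$ is the sample
mean given by (2.29). The sample variance $s_n^2$ given by
(2.30) can be used to approximate $\mathrm{Var}(\hat{\theta}_k)$. Proper
substitution of these quantities into (3.16) gives

$$\widehat{\mathrm{Var}}(\theta) = \frac{s_n^2 - \bar{\theta}_n}{k^2} \tag{3.20}$$

as an estimate of the prior variance. Thus when $E(\theta)$
and $\mathrm{Var}(\theta)$ are unknown the constant $a_1$ can be
approximated by

$$a_1 = \left[\frac{s_n^2 - \bar{\theta}_n}{k^2 s_n^2}\right]^{1/2}. \tag{3.21}$$

With the transformation $\theta^*_k$ completely determined,
the marginal density approximation $p_n(\theta^*_k)$ for $p(\theta^*_k)$ can
be formed. Substituting $p_n(\theta^*_k)$, when considered as a
function of $\theta$, for $g(\theta)$ in the Bayes estimator
$E(\theta \mid \hat{\theta}_k)$, we obtain as an alternative smooth estimator
for $\theta_n$

$$\tilde{\theta}_{P,V} = \frac{\displaystyle\sum_{i=1}^{n} \int_{\theta^*_{k,(1)}}^{\theta^*_{k,(n)}} \theta\, e^{-k\theta}\, \theta^{k\hat{\theta}_{k,n}} \left[\frac{\sin\!\left(\frac{\theta - \theta^*_{k,i}}{2h}\right)}{\frac{\theta - \theta^*_{k,i}}{2h}}\right]^{2} d\theta}
{\displaystyle\sum_{i=1}^{n} \int_{\theta^*_{k,(1)}}^{\theta^*_{k,(n)}} e^{-k\theta}\, \theta^{k\hat{\theta}_{k,n}} \left[\frac{\sin\!\left(\frac{\theta - \theta^*_{k,i}}{2h}\right)}{\frac{\theta - \theta^*_{k,i}}{2h}}\right]^{2} d\theta} \tag{3.22}$$

where $\theta^*_{k,(1)}$ and $\theta^*_{k,(n)}$ are the respective minimum
and maximum values of sequence (3.14), and $h$ is given
by (2.8).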

3.4 Iteration of the Smooth Estimators

Consider a sequence of smooth estimates from (3.12),

$$\tilde{\theta}_{P,1},\ \tilde{\theta}_{P,2},\ \ldots,\ \tilde{\theta}_{P,n}, \tag{3.23}$$

obtained in each of $n$ previous experiences. Based on
this sequence, a more "precise" estimate of the realization
$\theta_n$ can often be determined. The estimate is
obtained by replacing the prior density in the Bayes
estimator by the continuous density approximation

$$p_n(\theta) = \frac{1}{2\pi n h} \sum_{i=1}^{n} \left[\frac{\sin\!\left(\frac{\theta - \tilde{\theta}_{P,i}}{2h}\right)}{\frac{\theta - \tilde{\theta}_{P,i}}{2h}}\right]^{2} \tag{3.24}$$

constructed with sequence (3.23). This represents an
iteration of the smooth estimator $\tilde{\theta}_P$ and is denoted
by $\tilde{\theta}'_P$ since the kernel of integration remains unchanged.
A similar iteration of the estimator $\tilde{\theta}_{P,V}$ can be
obtained and will be denoted by $\tilde{\theta}'_{P,V}$. Integration in
each estimator is performed over the range of values
obtained from the first iteration. For example, the
region of integration for $\tilde{\theta}'_P$ is from $\min(\tilde{\theta}_{P,i})$ to
$\max(\tilde{\theta}_{P,j})$ where $i, j = 1, 2, \ldots, n$ and $i \neq j$. The
smooth estimator $\tilde{\theta}_P$ is defined for $n > 1$; therefore
for $i = 1$, we define $\tilde{\theta}_{P,1} = \hat{\theta}_{k,1}$.
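The density approximation used in the iteration can be evaluated directly. The Python sketch below assumes the squared $\sin(u)/u$ form with the factor $1/(2\pi n h)$, for which each term integrates to $1/n$ over the real line. The function name `sinc2_density`, the three estimates, and the bandwidth value are hypothetical.

```python
import numpy as np

def sinc2_density(theta, estimates, h):
    """Density approximation built from a sequence of estimates (assumed form of (3.24))."""
    theta = np.atleast_1d(np.asarray(theta, dtype=float))
    estimates = np.asarray(estimates, dtype=float)
    u = (theta[:, None] - estimates[None, :]) / (2.0 * h)
    kernel = np.sinc(u / np.pi) ** 2          # (sin u / u)^2, with np.sinc's pi scaling undone
    return kernel.sum(axis=1) / (2.0 * np.pi * len(estimates) * h)

# Crude Riemann-sum check that the approximation has total mass close to one.
step = 0.01
grid = np.arange(-1000.0, 1000.0, step)
total_mass = sinc2_density(grid, [2.0, 3.5, 5.0], h=0.5).sum() * step
```

The tails of this kernel decay only quadratically, so a noticeable share of the mass lies outside the range of the estimates. Restricting the integration to the observed range, as the smooth estimators do, therefore truncates the approximation.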


3.5 Monte Carlo Simulation

To ascertain the usefulness of the continuously
smooth empirical Bayes estimator $\tilde{\theta}_D$ as opposed to the
maximum-likelihood estimator, Monte Carlo simulation
was employed by means of a UNIVAC 1108 computer. The
criterion for comparison was mean-squared error, and
therefore the ratio

$$R = \frac{\text{empirical Bayes mean-squared error}}{\text{maximum-likelihood mean-squared error}} \tag{3.25}$$

was of interest. Here the symbol $\tilde{\theta}_D$ is used as a
general notational device to represent any smooth
estimator under consideration.

A value of $\theta$ was generated from a chosen prior
distribution; then a random sample $x_1, x_2, \ldots, x_k$
of size $k$, corresponding to the realization of $\theta$,
was obtained from (3.1). The maximum-likelihood
estimate $\hat{\theta}_k$ was found and its squared deviation
$(\theta - \hat{\theta}_k)^2$ from the corresponding parameter $\theta$ was
calculated. For the second experiment, a new value for
$\theta$ was generated and the process repeated, obtaining
$\hat{\theta}_k$ and its squared deviation. For this experiment, $\tilde{\theta}_D$
and its squared deviation $(\theta - \tilde{\theta}_D)^2$ from the corresponding
realization of $\theta$ were also calculated. This
was repeated twenty times, and each time $\tilde{\theta}_D$ was
calculated using the present $\hat{\theta}_k$ as well as all previous
maximum-likelihood estimates. Five hundred repetitions
of this run of twenty experiments were then made, and
the averages of the squared deviations of $\hat{\theta}_k$ and $\tilde{\theta}_D$
were formed as estimates of $E(\theta - \hat{\theta}_k)^2$ and $E(\theta - \tilde{\theta}_D)^2$.
Then the ratio $R$ was calculated utilizing these
estimated mean-squared errors. All numerical integrations
were performed by means of the eleven-point Gauss
quadrature formula. For details see Appendix A.
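The structure of this simulation can be sketched as a small Python harness. It is not a reproduction of the original program: the gamma prior (a Pearson Type III density chosen here so that $kE(\theta)/\mathrm{Var}(\theta) = 2$ with $k = 1$), the function name `mse_ratio`, the seed, and in particular the simple shrinkage rule used as a stand-in for a smooth estimator are all assumptions made for illustration. The ratio is evaluated only at the last experience of each run.

```python
import numpy as np

def mse_ratio(estimator, draw_prior, k=1, n_experiences=20, n_reps=500, seed=0):
    """Estimate the ratio R of (3.25) at the last of n_experiences experiments."""
    rng = np.random.default_rng(seed)
    se_eb = 0.0   # accumulated squared error of the candidate estimator
    se_ml = 0.0   # accumulated squared error of the maximum-likelihood estimate
    for _ in range(n_reps):
        history = []
        theta = None
        for _ in range(n_experiences):
            theta = draw_prior(rng)                 # new realization of theta
            sample = rng.poisson(theta, size=k)     # k observations from (3.1)
            history.append(sample.mean())           # maximum-likelihood estimate (3.4)
        history = np.array(history)
        se_ml += (theta - history[-1]) ** 2
        se_eb += (theta - estimator(history)) ** 2  # uses present and past estimates
    return se_eb / se_ml

# Hypothetical prior with E(theta) = 4 and Var(theta) = 2.
gamma_prior = lambda rng: rng.gamma(shape=8.0, scale=0.5)
# Stand-in estimator (NOT the smooth estimator): shrink toward the running mean.
shrink = lambda hist: 0.5 * hist[-1] + 0.5 * hist.mean()
ratio = mse_ratio(shrink, gamma_prior)
```

Passing the maximum-likelihood estimate itself as the estimator returns a ratio of exactly one, which is a convenient sanity check before substituting a numerical version of a smooth estimator.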


This procedure was repeated for all types of Pearson
prior distributions with varying coefficients of skewness
and kurtosis. As with Clemmer and Krutchkoff [3], Lemon
and Krutchkoff [17], and Martz and Krutchkoff [20], the
ratio $R$ was observed to be significantly influenced
by the prior distribution only through the value

$$Z = \frac{\mathrm{Var}^{*}(\hat{\theta}_k \mid \theta)}{\mathrm{Var}(\theta)} \tag{3.26}$$

where $*$ indicates that $E(\theta)$, the prior mean of $\theta$,
has been substituted for $\theta$. In particular, for a
random sample of $k$ observations from (3.1), the
maximum-likelihood estimator is distributed according
to (3.8), and the value $Z$ becomes

$$Z = \frac{kE(\theta)}{\mathrm{Var}(\theta)}. \tag{3.27}$$

Since it was found that the only factors affecting the
ratio $R$, apart from the number of experiences, are
contained in (3.27), this quantity can be conveniently
used to summarize and index a given situation.

It has been repeatedly observed in the Monte Carlo
study that the sample size $k$ has no effect on the ratio
$R$ when formed with $\tilde{\theta}_P$ or $\tilde{\theta}_{P,V}$. Therefore without
loss of generality, $k$ will be taken to be one. We will
denote the ratio $R$ when calculated with $\tilde{\theta}_P$ by $R_P$
and when calculated with $\tilde{\theta}_{P,V}$ by $R_{P,V}$.

To support the claim that the smooth empirical
Bayes estimators are indeed robust to the form of the
prior distribution, the values of $\theta_i$ $(i = 1, 2, \ldots, 20)$
were generated from various distributions. For all types
of Pearson prior distributions with varying coefficients
of skewness and kurtosis, the ratio $R$ has been observed
to vary only slightly for a given value of $n$, provided
the value of $Z$ remains unchanged. This is illustrated
in Figures 3-6 for a given value of $Z = 2.0$. The solid
line in each figure represents the ratio $R_{P,V}$ calculated
with the smooth estimator $\tilde{\theta}_{P,V}$; and the broken
line represents the ratio $R_P$ calculated with $\tilde{\theta}_P$.
This representation will be used in the remaining figures
of this chapter. The parameters of the prior distributions
are designated as follows:

E($\theta$) = prior mean of $\theta$

V($\theta$) = prior variance of $\theta$

S = skewness

K = kurtosis

In Figure 3 the prior distribution is bell-shaped 
(skewed); in Figure 4, L-shaped; in Figure 5, J-shaped; 
and in Figure 6, U-shaped. These estimators are evi- 
dently rather insensitive to the form of the prior 
distribution as summarized by S and K ; therefore, 
values of S and K will not be given for the remaining 
figures in this chapter. 

Values of the ratios $R_P$ and $R_{P,V}$ are plotted
in Figures 7 and 8 respectively. These values are
plotted for different values of $E(\theta)$ and $V(\theta)$, summarized
by $Z$. The values of $Z$ range from 0.5 to 5.0. We
note that as $Z$ increases, the values of the ratios
$R_{P,V}$ and $R_P$ tend to decrease. This phenomenon is
best understood by considering the summary quantity $Z$,
defined by (3.26). If $\mathrm{Var}(\hat{\theta}_k \mid \theta)$ is large as compared
to $\mathrm{Var}(\theta)$, then the classical estimates of $\theta$ will vary
widely. The smooth estimators, however, are capable of
"detecting" this variation and use this information to
obtain "better" estimates of $\theta$. Conversely if
$\mathrm{Var}(\hat{\theta}_k \mid \theta)$ is small as compared to $\mathrm{Var}(\theta)$, then the
classical method would be expected to do quite well.
In this case there is a great deal of information within
an experiment, and previous experiments contribute very
little information about the parameter. We also notice
that for a given number of experiences, the decrease
of $R_{P,V}$ is more significant than that of $R_P$, demonstrating
the superiority of the smooth estimator $\tilde{\theta}_{P,V}$
over the smooth estimator $\tilde{\theta}_P$. This increase in
mean-squared precision is easily explained. The prior density
approximation used in $\tilde{\theta}_{P,V}$ is based on a sequence of
values having a marginal distribution whose mean and
variance are approximately equivalent to those of the
prior distribution. The prior density approximation used
in $\tilde{\theta}_P$, however, is based on a sequence of maximum-likelihood
estimates having a marginal distribution whose
mean and variance are known to overestimate those of the
prior distribution. Hence we expect $\tilde{\theta}_{P,V}$ to provide a
significant increase in squared-error precision over $\tilde{\theta}_P$,
as seen by comparing the corresponding values of the
ratios $R_{P,V}$ and $R_P$ of Figures 7 and 8.

Figure 9 shows a typical result obtained when $\tilde{\theta}_{P,V}$
is iterated as described in section 3.4. The ratio
$R'_{P,V}$ formed with the iterated smooth estimator $\tilde{\theta}'_{P,V}$
is denoted by the dotted line. We note that a second
iteration of $\tilde{\theta}_{P,V}$ slightly decreased any improvement
obtained by $\tilde{\theta}_{P,V}$. This decrease in precision is
expected. The marginal density approximation used to
represent the prior density in $\tilde{\theta}_{P,V}$ is based on a
sequence of estimates whose distribution has its mean
and variance approximately equivalent to those of the
prior distribution. The prior density approximation
in $\tilde{\theta}'_{P,V}$, however, is not known to have this
property.

In Figure 10 results from a second iteration of
$\tilde{\theta}_P$ are given when the parameters of the prior distribution
are identical to those used in Figure 9. The dotted
line is used to represent the ratio $R'_P$ based on the
smooth estimator $\tilde{\theta}'_P$. We note that in this case,
iteration does improve the ratio $R_P$; however, this
improvement is not as significant as that obtained using
$\tilde{\theta}_{P,V}$ given in Figure 9. It is conjectured that the
overestimation of the prior variance by the marginal
distribution of $\hat{\theta}_k$ has a far more significant effect
on the prior density approximation than does the marginal
variance of $\tilde{\theta}_P$. In general, iteration of the smooth
estimators is discouraged. While a second iteration of
$\tilde{\theta}_P$ does decrease the squared error, the decrease is not
as significant as that obtained using $\tilde{\theta}_{P,V}$. A second
iteration of $\tilde{\theta}_{P,V}$ usually increases the squared error.

In all cases considered in the Monte Carlo study,
the estimator $\tilde{\theta}_{P,V}$ provided consistent improvement over
the maximum-likelihood estimator for two or more past
experiences. It was also observed to be the most
efficient of the smooth estimators. Since this improvement
was uniform over a wide variety of $Z$ values, we
are confident that $\tilde{\theta}_{P,V}$ can be used in any situation.
However, when some idea of the prior mean and variance
is known, Figure 11 can be used to obtain an indication
of the amount of improvement $\tilde{\theta}_{P,V}$ will provide over the
maximum-likelihood estimator. In this figure, the ratio
$R_{P,V}$ is plotted as a function of $Z$ for a given number
of experiences. Thus one can obtain some a priori idea
of the improvement over maximum-likelihood that is likely
to be obtained.


















CHAPTER IV 


ESTIMATION OF THE WEIBULL SCALE PARAMETER 
WITH KNOWN SHAPE PARAMETER 

In this chapter continuously smooth empirical Bayes
estimators are given for the scale parameter $\alpha$ in the
two-parameter Weibull distribution. The scale parameter
is assumed to vary randomly throughout a sequence of
experiments and the shape parameter $\beta$ is assumed to be
known. It may, however, be different in each experiment.
Results from Monte Carlo simulations are reported which
show that even for small sample sizes and few experiments
the smooth estimators have smaller mean-squared errors
than the maximum-likelihood estimators.

4.1 Maximum-Likelihood Estimator for $\alpha$

Let $X$ be a random variable having a Weibull distribution
with known shape parameter $\beta$. If the scale
parameter $\alpha$ is a random variable, then the conditional
density function of $X$ is given by

$$f(x \mid \alpha) = \alpha\beta x^{\beta-1} e^{-\alpha x^{\beta}}, \qquad (x > 0;\ \alpha, \beta > 0). \tag{4.1}$$

Consider a random sample of $k$ observations from
(4.1). The likelihood function of this sample is

$$L(x \mid \alpha) = (\alpha\beta)^{k} \left[\prod_{i=1}^{k} x_i^{\beta-1}\right] \exp\!\left(-\alpha \sum_{i=1}^{k} x_i^{\beta}\right) \tag{4.2}$$

and by defining

$$t = \sum_{i=1}^{k} x_i^{\beta} \tag{4.3}$$

can be expressed as the product

$$L(x \mid \alpha) = Q(t \mid \alpha)\,c(x)$$

where

$$c(x) = \beta^{k} \prod_{i=1}^{k} x_i^{\beta-1}$$

and

$$Q(t \mid \alpha) = \alpha^{k} e^{-\alpha t}.$$

Thus, the factorization criterion [23] assures that $T$ is
a sufficient statistic for $\alpha$.

The Weibull distribution satisfies the necessary
regularity conditions given in section 1.7 to allow the
maximum-likelihood estimator for $\alpha$ to be found by the
solution of equation (1.22). Therefore, taking the
logarithm of (4.2) and differentiating with respect to
$\alpha$, we have

$$\frac{d \log L(x \mid \alpha)}{d\alpha} = \frac{k}{\alpha} - \sum_{i=1}^{k} x_i^{\beta}. \tag{4.4}$$

Equating (4.4) to zero and solving for $\alpha$, we obtain the
maximum-likelihood estimator

$$\hat{\alpha}_k = \frac{k}{\sum_{i=1}^{k} x_i^{\beta}}. \tag{4.5}$$

This estimator can clearly be obtained by a one-to-one
transformation on the sufficient statistic $T$; therefore
the maximum-likelihood estimator $\hat{\alpha}_k$ provides a sufficient
statistic for each realization of $\alpha$. For ease of
notation, the subscript $k$ representing the sample size
will be suppressed.
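A short Python sketch of (4.3) through (4.5) follows. The function names and the two-point sample are hypothetical and only illustrate that the estimate is a function of the sufficient statistic and that the score (4.4) vanishes there.

```python
def weibull_sufficient_statistic(x, beta):
    """The statistic t of (4.3): the sum of x_i ** beta."""
    return sum(xi ** beta for xi in x)

def weibull_scale_mle(x, beta):
    """Maximum-likelihood estimate (4.5) of the scale parameter for known beta."""
    return len(x) / weibull_sufficient_statistic(x, beta)

def weibull_score(x, beta, alpha):
    """Derivative of the log-likelihood with respect to alpha, equation (4.4)."""
    return len(x) / alpha - weibull_sufficient_statistic(x, beta)

alpha_hat = weibull_scale_mle([1.0, 2.0], beta=2.0)   # t = 1 + 4 = 5, so 2 / 5
```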



4.2 Distribution of the Maximum-Likelihood Estimator
for $\alpha$

According to (1.25), the asymptotic distribution of
the maximum-likelihood estimator $\hat{\alpha}$ given the parameter
$\alpha$ is normal with mean

$$E(\hat{\alpha} \mid \alpha) = \alpha \tag{4.6}$$

and variance given by

$$\mathrm{Var}(\hat{\alpha} \mid \alpha) = -\left[E\!\left(\frac{d^2 \log L(x \mid \alpha)}{d\alpha^2}\right)\right]^{-1}. \tag{4.7}$$

If we differentiate (4.4) with respect to $\alpha$ and form
the expectation with respect to $X$, then

$$E\!\left[\frac{d^2 \log L(x \mid \alpha)}{d\alpha^2}\right] = -\frac{k}{\alpha^2}. \tag{4.8}$$

Hence the variance becomes

$$\mathrm{Var}(\hat{\alpha} \mid \alpha) = \frac{\alpha^2}{k} \tag{4.9}$$

and

$$\text{distr.}(\hat{\alpha} \mid \alpha) = N\!\left(\alpha,\ \frac{\alpha^2}{k}\right). \tag{4.10}$$


Since the maximum-likelihood estimator is representable
in closed form, its conditional distribution can be
found for finite values of $k$. This distribution will be
obtained through a series of variable changes and will
require the use of moment-generating functions.

Consider the distribution of the variable

$$Y = X^{\beta} \tag{4.11}$$

where the conditional density of $X$ is given by (4.1).
Since $Y$ is either an increasing or decreasing function
of $X$, depending on the fixed value $\beta$, the conditional
density function of $Y$, say $h(y \mid \alpha)$, is given by the
formula

$$h(y \mid \alpha) = f(x \mid \alpha)\left|\frac{dx}{dy}\right| \tag{4.12}$$

in which $X$ is to be replaced by its value in terms of
$Y$ given in equation (4.11). This formula represents a
simple change of variable technique which can be found in
Hoel [14]. Based on formula (4.12), the conditional density
function of $Y$ given $\alpha$ becomes the exponential
density function

$$h(y \mid \alpha) = \alpha e^{-\alpha y}, \qquad (y > 0;\ \alpha > 0). \tag{4.13}$$

Corresponding to this density function, the moment-generating
function is given by

$$M_Y(\theta) = \int_0^{\infty} \alpha e^{-(\alpha - \theta)y}\, dy = \frac{1}{1 - \dfrac{\theta}{\alpha}}. \tag{4.14}$$

Since each $x_i^{\beta}$ $(i = 1, 2, \ldots, k)$ is independently and
identically distributed, the moment-generating function
of $T$ given $\alpha$, $M_T(\theta)$, can be written as

$$M_T(\theta) = M_{x_1^{\beta} + x_2^{\beta} + \cdots + x_k^{\beta}}(\theta)
= M_{x_1^{\beta}}(\theta)\, M_{x_2^{\beta}}(\theta) \cdots M_{x_k^{\beta}}(\theta),$$

and by (4.14) as

$$M_T(\theta) = \left(1 - \frac{\theta}{\alpha}\right)^{-k}.$$

Now this function is precisely the moment-generating
function for a gamma distribution with density function

$$f(t \mid \alpha) = \frac{\alpha\, e^{-\alpha t} (\alpha t)^{k-1}}{\Gamma(k)}, \qquad (t > 0;\ k, \alpha > 0); \tag{4.15}$$

therefore (4.15) represents the conditional density
function of $T$ given $\alpha$. Based on this result, the
conditional distribution of the maximum-likelihood estimator
is easily obtained. If the previously described
change of variable procedure is followed, the maximum-likelihood
estimator $\hat{\alpha}$ can be shown to have the
conditional density function

$$f(\hat{\alpha} \mid \alpha) = \frac{1}{\Gamma(k)\,\alpha k}\left(\frac{\alpha k}{\hat{\alpha}}\right)^{k+1} e^{-\alpha k/\hat{\alpha}}, \qquad (\hat{\alpha} > 0). \tag{4.16}$$

This density function is recognizable as an inverted gamma
density function with mean

$$E(\hat{\alpha} \mid \alpha) = \frac{\alpha k}{k-1} \tag{4.17}$$

and variance

$$\mathrm{Var}(\hat{\alpha} \mid \alpha) = \frac{(\alpha k)^2}{(k-1)^2 (k-2)}. \tag{4.18}$$

Thus for finite sample sizes, the maximum-likelihood
estimator $\hat{\alpha}$ conditional on any $\alpha$ has the inverted
gamma distribution given by (4.16).
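The conditional mean (4.17) and variance (4.18) can be checked by simulation. The Python sketch below draws Weibull samples by transforming exponential variates, using the fact stated in (4.13) that $X^{\beta}$ is exponential with rate $\alpha$. The function name, the parameter values, the seed, and the number of replications are arbitrary illustrative choices.

```python
import numpy as np

def simulate_scale_mle(alpha, beta, k, n_reps, seed=0):
    """Draw n_reps samples of size k from (4.1) and return the estimates (4.5)."""
    rng = np.random.default_rng(seed)
    y = rng.exponential(scale=1.0 / alpha, size=(n_reps, k))   # Y = X**beta, per (4.13)
    x = y ** (1.0 / beta)                                      # Weibull observations
    t = (x ** beta).sum(axis=1)                                # sufficient statistic (4.3)
    return k / t

alpha, beta, k = 2.0, 1.5, 8
estimates = simulate_scale_mle(alpha, beta, k, n_reps=200_000)
theory_mean = alpha * k / (k - 1)                        # (4.17)
theory_var = (alpha * k) ** 2 / ((k - 1) ** 2 * (k - 2)) # (4.18)
```

The simulated mean exceeds $\alpha$ by roughly the factor $k/(k-1)$, which is the finite-sample bias that the variance-corrected estimators account for later in this chapter. Note that (4.18) requires $k > 2$.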

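4.3 Smooth Empirical Bayes Estimators for $\alpha$

Assume that the data obtained in each of $n$ previous
experiments can be adequately described by a Weibull
distribution with known shape parameter $\beta$. Further
assume that between successive experiments, the scale
parameter $\alpha$ varies randomly with the same but unknown
prior density function $g(\alpha)$. If in each experiment a
maximum-likelihood estimate $\hat{\alpha}_i$ $(i = 1, 2, \ldots, n)$ was obtained
for each realization from $g(\alpha)$, then the sequence of
sufficient and consistent estimates

$$\hat{\alpha}_1,\ \hat{\alpha}_2,\ \ldots,\ \hat{\alpha}_n \tag{4.19}$$

can be used to form the marginal density approximation

$$p_n(\hat{\alpha}) = \frac{1}{2\pi n h} \sum_{i=1}^{n} \left[\frac{\sin\!\left(\frac{\hat{\alpha} - \hat{\alpha}_i}{2h}\right)}{\frac{\hat{\alpha} - \hat{\alpha}_i}{2h}}\right]^{2} \tag{4.20}$$

where $h$ is given by (2.8).

Convergence in probability of (4.20) to the prior
density $g(\alpha)$ as both the sample size $k$ and the number
of experiments $n$ tend to infinity is assured by Theorem
2.2. Therefore when the kernel of integration is the
inverted gamma density given by (4.16), $p_n(\hat{\alpha})$ can be
considered as a function of $\alpha$ and a continuously smooth
empirical Bayes estimator for $\alpha_n$ becomes

$$\tilde{\alpha}_G = \frac{\displaystyle\sum_{i=1}^{n} \int_{\hat{\alpha}_{(1)}}^{\hat{\alpha}_{(n)}} \left(\frac{\alpha k}{\hat{\alpha}_n}\right)^{k+1} e^{-\alpha k/\hat{\alpha}_n} \left[\frac{\sin\!\left(\frac{\alpha - \hat{\alpha}_i}{2h}\right)}{\frac{\alpha - \hat{\alpha}_i}{2h}}\right]^{2} d\alpha}
{\displaystyle\sum_{i=1}^{n} \int_{\hat{\alpha}_{(1)}}^{\hat{\alpha}_{(n)}} \frac{1}{\alpha}\left(\frac{\alpha k}{\hat{\alpha}_n}\right)^{k+1} e^{-\alpha k/\hat{\alpha}_n} \left[\frac{\sin\!\left(\frac{\alpha - \hat{\alpha}_i}{2h}\right)}{\frac{\alpha - \hat{\alpha}_i}{2h}}\right]^{2} d\alpha} \tag{4.21}$$

where $\hat{\alpha}_{(1)}$ and $\hat{\alpha}_{(n)}$ are the respective minimum and
maximum values of sequence (4.19).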


The kernel of integration in (4.21) represents the
conditional density function of the maximum-likelihood
estimator $\hat{\alpha}$ given $\alpha$ for fixed and finite values of the
sample size $k$. If the asymptotic normal density (4.10)
is used as the kernel of integration instead of (4.16),
then an alternative smooth estimator for $\alpha_n$ can be given
by

$$\tilde{\alpha}_N = \frac{\displaystyle\sum_{i=1}^{n} \int_{\hat{\alpha}_{(1)}}^{\hat{\alpha}_{(n)}} \exp\!\left[-\frac{k(\hat{\alpha}_n - \alpha)^2}{2\alpha^2}\right] \left[\frac{\sin\!\left(\frac{\alpha - \hat{\alpha}_i}{2h}\right)}{\frac{\alpha - \hat{\alpha}_i}{2h}}\right]^{2} d\alpha}
{\displaystyle\sum_{i=1}^{n} \int_{\hat{\alpha}_{(1)}}^{\hat{\alpha}_{(n)}} \frac{1}{\alpha} \exp\!\left[-\frac{k(\hat{\alpha}_n - \alpha)^2}{2\alpha^2}\right] \left[\frac{\sin\!\left(\frac{\alpha - \hat{\alpha}_i}{2h}\right)}{\frac{\alpha - \hat{\alpha}_i}{2h}}\right]^{2} d\alpha}. \tag{4.22}$$

For small sample sizes, which are usually encountered
in practical situations, the estimator $\tilde{\alpha}_N$ is shown in
section 4.6 to provide inferior results when compared
with $\tilde{\alpha}_G$. This is not too surprising since $\tilde{\alpha}_N$ is based
on the asymptotic density of $\hat{\alpha}$ given $\alpha$ rather than
on the true density as is $\tilde{\alpha}_G$. Therefore when $\beta$ is
known, we use $\tilde{\alpha}_G$ to estimate the scale parameter $\alpha$.
The estimator $\tilde{\alpha}_N$, however, provides substantial squared-error
improvements over the corresponding maximum-likelihood
estimator and, for reasons to be made apparent in the
following discussion, has been considered here.

In some instances a closed form representation for a
classical estimator $\hat{\alpha}$ will be unattainable. The
asymptotic distribution of the estimator may be known to
be normal, in which case $\tilde{\alpha}_N$ can be obtained. In section
4.6 the ability of $\tilde{\alpha}_N$ to provide good squared-error
results is demonstrated and can be directly compared with
that obtained from $\tilde{\alpha}_G$. This comparison provides some
indication about the degree of departure one can expect
when a smooth estimator is based on the asymptotic
distribution rather than the true distribution of a
classical estimator. Obviously the degree of departure
depends on the rate of convergence to the asymptotic
distribution as a function of $k$, the number of observations
in each experiment. Hence no general conclusion can be
reached.

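4.4 Smooth Empirical Bayes Estimators for $\alpha$ Corrected
for Variance

In section 4.2 the asymptotic distribution of the
maximum-likelihood estimator $\hat{\alpha}$ given any $\alpha$ was shown
to be

$$\text{distr.}(\hat{\alpha} \mid \alpha) = N\!\left(\alpha,\ \frac{\alpha^2}{k}\right).$$

This distribution corresponds to the distribution given
by (2.21). Their respective means coincide, and by setting
$c = 1$ in (2.21) their variances become equivalent. Hence
the transformation

$$\alpha^* = a_1\left(\hat{\alpha} - E(\hat{\alpha})\right) + E(\alpha) \tag{4.23}$$

defined by (2.25) can be performed on each element of
sequence (4.19), thus forming a new sequence

$$\alpha^*_1,\ \alpha^*_2,\ \ldots,\ \alpha^*_n \tag{4.24}$$

having a marginal distribution whose mean and variance
approximately equal those of the prior distribution.
When the prior mean and variance of $\alpha$ are known, the
constant $a_1$ in (4.23) is given by

$$a_1 = \left[\frac{k\,\mathrm{Var}(\alpha)}{(1+k)\,\mathrm{Var}(\alpha) + E^2(\alpha)}\right]^{1/2}; \tag{4.25}$$

otherwise

$$a_1 = \left[\frac{1}{1+k}\left(k - \frac{\bar{\alpha}_n^2}{s_n^2}\right)\right]^{1/2} \tag{4.26}$$

where

$$\bar{\alpha}_n = \frac{1}{n}\sum_{i=1}^{n} \hat{\alpha}_i \tag{4.27}$$

and

$$s_n^2 = \frac{1}{n-1}\sum_{i=1}^{n} \left(\hat{\alpha}_i - \bar{\alpha}_n\right)^2. \tag{4.28}$$

Any density estimator $p_n(\alpha^*)$ formed from sequence
(4.24) has the property that

$$\operatorname*{p\,lim}_{\substack{n\to\infty \\ k\to\infty}} p_n(\alpha^*) = g(\alpha).$$

Therefore, when considered as a function of $\alpha$, $p_n(\alpha^*)$
can be used to approximate the prior density function,
and a smooth estimator for $\alpha_n$ becomes

$$\tilde{\alpha}_{N,V} = \frac{\displaystyle\sum_{i=1}^{n} \int_{\alpha^*_{(1)}}^{\alpha^*_{(n)}} \exp\!\left[-\frac{k(\hat{\alpha}_n - \alpha)^2}{2\alpha^2}\right] \left[\frac{\sin\!\left(\frac{\alpha - \alpha^*_i}{2h}\right)}{\frac{\alpha - \alpha^*_i}{2h}}\right]^{2} d\alpha}
{\displaystyle\sum_{i=1}^{n} \int_{\alpha^*_{(1)}}^{\alpha^*_{(n)}} \frac{1}{\alpha} \exp\!\left[-\frac{k(\hat{\alpha}_n - \alpha)^2}{2\alpha^2}\right] \left[\frac{\sin\!\left(\frac{\alpha - \alpha^*_i}{2h}\right)}{\frac{\alpha - \alpha^*_i}{2h}}\right]^{2} d\alpha} \tag{4.29}$$

where $\alpha^*_{(1)}$ and $\alpha^*_{(n)}$ are the respective minimum and
maximum values of sequence (4.24). Since the asymptotic
normal density is used as the kernel of integration in
(4.29), $\tilde{\alpha}_{N,V}$ corresponds to the smooth estimator $\tilde{\alpha}_N$
given by (4.22).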

The sequence (4.24) is obtained under the assumption
that the conditional distribution of the maximum-likelihood
estimator is normal. For finite values of $k$, the actual
distribution of this variable is known to be the inverted
gamma distribution. Based on this distribution, a transformation
similar to (4.23) can be obtained and used to
develop a smooth estimator $\tilde{\alpha}_{G,V}$ which corresponds to
$\tilde{\alpha}_G$. This transformation is found by following a
development similar to the one used in section 2.5 for
the normal distribution.

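The mean of the inverted gamma distribution is given
by (4.17) and its variance by (4.18). If these values and
the relations of conditional probability given by (2.22)
and (2.23) are used, the mean and the variance for the
marginal distribution of the maximum-likelihood estimator
$\hat{\alpha}$ become

$$E(\hat{\alpha}) = \frac{k}{k-1}\,E(\alpha) \tag{4.30}$$

and

$$
\begin{aligned}
\mathrm{Var}(\hat{\alpha})
 &= E\!\left[\frac{(\alpha k)^{2}}{(k-1)^{2}(k-2)}\right] + \mathrm{Var}\!\left(\frac{\alpha k}{k-1}\right) \\
 &= \frac{k^{2}}{(k-1)^{2}(k-2)}\,E(\alpha^{2}) + \frac{k^{2}}{(k-1)^{2}}\,\mathrm{Var}(\alpha) \\
 &= \frac{k^{2}}{(k-1)^{2}(k-2)}\bigl(\mathrm{Var}(\alpha) + E^{2}(\alpha)\bigr) + \frac{k^{2}}{(k-1)^{2}}\,\mathrm{Var}(\alpha) \\
 &= \frac{k^{2}}{(k-1)^{2}}\,\mathrm{Var}(\alpha)\left(1 + \frac{1}{k-2}\right) + \frac{k^{2}}{(k-1)^{2}(k-2)}\,E^{2}(\alpha)
\end{aligned}
\tag{4.31}
$$

respectively.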

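In order to obtain a sequence of values

$$\alpha'_{1},\ \alpha'_{2},\ \ldots,\ \alpha'_{n} \tag{4.32}$$

such that $E(\alpha') = E(\alpha)$ and $\mathrm{Var}(\alpha') = \mathrm{Var}(\alpha)$, consider
the transformation of $\hat{\alpha}$ given by

$$\alpha' = c_{1}\bigl(\hat{\alpha} - E(\hat{\alpha})\bigr) + E(\alpha) \tag{4.33}$$

where $c_1$ is a constant to be determined. Now

$$E(\alpha') = c_{1}E(\hat{\alpha}) - c_{1}E(\hat{\alpha}) + E(\alpha) = E(\alpha) \tag{4.34}$$

and

$$\mathrm{Var}(\alpha') = c_{1}^{2}\,\mathrm{Var}(\hat{\alpha}). \tag{4.35}$$

If (4.31) is substituted into (4.35) and $E(\alpha)$ is replaced
by

$$E(\alpha) = \frac{k-1}{k}\,E(\hat{\alpha}),$$

then the variance of $\alpha'$ becomes

$$
\begin{aligned}
\mathrm{Var}(\alpha')
 &= c_{1}^{2}\left[\frac{k^{2}}{(k-1)^{2}}\,\mathrm{Var}(\alpha)\left(1 + \frac{1}{k-2}\right) + \frac{E^{2}(\hat{\alpha})}{k-2}\right] \\
 &= c_{1}^{2}\left[\frac{k^{2}\,\mathrm{Var}(\alpha) + E^{2}(\hat{\alpha})(k-1)}{(k-1)(k-2)}\right].
\end{aligned}
\tag{4.36}
$$

Hence by defining

$$c_{1} = \left[\frac{(k-1)(k-2)\,\mathrm{Var}(\alpha)}{k^{2}\,\mathrm{Var}(\alpha) + E^{2}(\hat{\alpha})(k-1)}\right]^{1/2} \tag{4.37}$$

we obtain

$$\mathrm{Var}(\alpha') = \mathrm{Var}(\alpha),$$

the desired result.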

The mean and variance of the prior distribution will
generally remain unknown. Thus the constant $c_1$ given
by (4.37) cannot be exactly determined. As in section
2.5 this problem can be easily resolved by finding
estimates for both $E(\alpha)$ and $\mathrm{Var}(\alpha)$. Since
$E(\alpha) = [(k-1)/k]\,E(\hat{\alpha})$, the sample mean $\bar{\alpha}_n$ can be
used to approximate $E(\hat{\alpha})$; hence $E(\alpha)$ can be estimated by

$$\hat{E}(\alpha) = \frac{k-1}{k}\,\bar{\alpha}_{n}. \tag{4.38}$$

If in (4.31) $\mathrm{Var}(\hat{\alpha})$ is replaced by the sample variance
$s_n^2$ and $E(\alpha)$ is replaced by $\hat{E}(\alpha)$, then

$$s_{n}^{2} = \frac{k^{2}}{(k-1)^{2}}\,\mathrm{Var}(\alpha)\left(1 + \frac{1}{k-2}\right) + \frac{\bar{\alpha}_{n}^{2}}{k-2}. \tag{4.39}$$

Solving (4.39) for $\mathrm{Var}(\alpha)$ we obtain

$$\widehat{\mathrm{Var}}(\alpha) = \frac{(k-1)\bigl[(k-2)\,s_{n}^{2} - \bar{\alpha}_{n}^{2}\bigr]}{k^{2}} \tag{4.40}$$

as an estimate of $\mathrm{Var}(\alpha)$. Proper substitution of (4.38)
and (4.40) into (4.37) gives

$$c_{1} = \left[\frac{k-1}{k^{2}}\left((k-2) - \frac{\bar{\alpha}_{n}^{2}}{s_{n}^{2}}\right)\right]^{1/2}. \tag{4.41}$$

Hence when the mean and variance of the prior distribution
are unknown, (4.41) can be used to represent $c_1$ in the
transformation (4.33).
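The moment estimates and the constant for the inverted gamma case can be checked with a short computation. This is a sketch under stated assumptions: it takes the prior-mean estimate to be $[(k-1)/k]\bar{\alpha}_n$ as in (4.38), solves the marginal-variance relation (4.31) for the prior variance with $s_n^2$ in place of $\mathrm{Var}(\hat{\alpha})$, and substitutes both into (4.37). The function names, the sample values, and the clamp at zero are introduced here for illustration only.

```python
import math
from statistics import mean, variance


def prior_moment_estimates(mles, k):
    """Moment estimates of the prior mean and variance of alpha."""
    abar = mean(mles)
    s2 = variance(mles)                       # divisor n - 1
    prior_mean = (k - 1) / k * abar           # as in (4.38)
    prior_var = (k - 1) * ((k - 2) * s2 - abar ** 2) / k ** 2
    return prior_mean, max(prior_var, 0.0)    # clamp is an assumption


def corrected_sequence_gamma(mles, k):
    """Apply the transformation (4.33) with an estimated c_1 from (4.37)."""
    abar = mean(mles)
    prior_mean, prior_var = prior_moment_estimates(mles, k)
    c1 = math.sqrt((k - 1) * (k - 2) * prior_var
                   / (k ** 2 * prior_var + abar ** 2 * (k - 1)))
    return [c1 * (a - abar) + prior_mean for a in mles]


estimates = [1.0, 2.0, 3.0, 4.0]   # hypothetical past estimates
print(corrected_sequence_gamma(estimates, k=10))
```

By construction the corrected values have sample mean equal to the estimated prior mean and sample variance equal to the estimated prior variance, which is the property required of sequence (4.32).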

With this transformation completely determined,
sequence (4.32) is obtainable, and a density estimator based
on these estimates can be used to replace the prior density
function, giving the smooth estimator for $\alpha_n$

(4.42)

The limits of integration $\alpha'_{(1)}$ and $\alpha'_{(n)}$ are the
respective minimum and maximum of sequence (4.32).

4.5 Iteration of the Smooth Estimators

Consider a sequence of smooth estimates from (4.21)

$$\tilde{\alpha}_{G,1},\ \tilde{\alpha}_{G,2},\ \ldots,\ \tilde{\alpha}_{G,n} \tag{4.43}$$

obtained in each of n past experiments. Based on this
sequence, a more "precise" estimate of the realization
$\alpha_n$ can often be determined. This estimate is obtained
by replacing the prior density in the Bayes estimator by
a density approximation constructed with sequence (4.43).
This approach represents an iteration of the smooth
estimator $\tilde{\alpha}_{G}$ and is denoted by $\tilde{\alpha}'_{G}$, since the kernel of
integration remains unchanged. Similar iterations of
the estimators $\tilde{\alpha}_{G,V}$, $\tilde{\alpha}_{N}$, and $\tilde{\alpha}_{N,V}$ can be obtained
and will be denoted by $\tilde{\alpha}'_{G,V}$, $\tilde{\alpha}'_{N}$, and $\tilde{\alpha}'_{N,V}$
respectively. Integration in each estimator is performed
over the range of values obtained from the first iteration.
For example, the region of integration for $\tilde{\alpha}'_{G}$ is from
$\min(\tilde{\alpha}_{G,i})$ to $\max(\tilde{\alpha}_{G,j})$ where $i, j = 1, 2, \ldots, n$ and
$i \neq j$. For i = 1, we define $\tilde{\alpha}_{G,1} = \hat{\alpha}_{1}$.

4.6 Monte Carlo Simulation

The continuously smooth empirical Bayes estimators
given in the previous sections were investigated for small
samples by Monte Carlo simulation and compared to
maximum-likelihood estimators. The simulation was conducted in a
manner analogous to the procedure described in section 3.5.
In each experiment a value of $\alpha$ was generated from a
prior distribution belonging to the Pearson family of
distributions. This parameter was then used to obtain a
sample of size k from (4.1), and the maximum-likelihood
and continuously smooth empirical Bayes estimates were
calculated. Five hundred repetitions of each of twenty
experiments were made for each prior distribution, and the
ratio

$$R = \frac{\text{empirical Bayes mean-squared error}}{\text{maximum-likelihood mean-squared error}}$$

was formed. As in section 3.5, it was found that the ratio
R was significantly influenced by the prior distribution
only through the value

$$Z = \frac{\mathrm{Var}^{*}(\hat{\alpha} \mid \alpha)}{\mathrm{Var}(\alpha)},$$

where * indicates that $E(\alpha)$, the prior mean of $\alpha$,
has been substituted for $\alpha$. In particular for a random
sample of size k from (4.1), $\mathrm{Var}(\hat{\alpha} \mid \alpha)$ is given by (4.18),
and the value Z becomes

$$Z = \frac{k^{2}\,E^{2}(\alpha)}{(k-1)^{2}(k-2)\,\mathrm{Var}(\alpha)}. \tag{4.44}$$

Since the only factors affecting the ratio R, apart
from the number of experiences, are contained in (4.44),
this quantity can be conveniently used to summarize and
index a given situation.
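For a given sample size and prior, the index Z of (4.44) is a one-line computation. The helper below is an illustrative sketch; its name and arguments are assumptions, and it takes the conditional variance of the maximum-likelihood estimator to be $k^2\alpha^2/[(k-1)^2(k-2)]$, the inverted gamma variance used in (4.31).

```python
def z_index_scale(k, prior_mean, prior_var):
    """Z of (4.44): conditional variance at the prior mean over prior variance."""
    if k <= 2:
        raise ValueError("the inverted gamma variance requires k > 2")
    cond_var = k ** 2 * prior_mean ** 2 / ((k - 1) ** 2 * (k - 2))
    return cond_var / prior_var


# Hypothetical situation: k = 10 observations, prior mean 2.0, prior variance 0.5.
print(z_index_scale(10, 2.0, 0.5))
```

Large values of Z describe situations in which sampling error dominates the prior spread, which is where the ratios reported in this section are smallest.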
As in Chapter III we wish to illustrate the robustness
of the continuously smooth empirical Bayes estimators to
the form of the prior distribution. Accordingly, values
of $\alpha_i$ (i = 1, 2, ..., 20) were generated from various
Pearson prior distributions. The ratio R was observed to
vary only slightly for a given value of n, provided
the value of Z remained invariant from distribution to
distribution. Illustrations of this fact are presented
in Figures 12-15 for a given value of Z = 2.0. In
Figures 12, 13, 14, and 15 the prior distribution is
bell-shaped (skewed), L-shaped, J-shaped, and U-shaped
respectively. The solid line in each figure represents the
ratio $R_{G,V}$ calculated with the smooth estimator $\tilde{\alpha}_{G,V}$
given by (4.42); the broken line represents the ratio $R_{G}$
calculated with the smooth estimator $\tilde{\alpha}_{G}$ given by (4.21).
This representation is used in the remaining figures of
this chapter. The value Z is the same for all four
figures, but skewness (S) and kurtosis (K) vary, giving
different forms to the prior distributions. Little
difference, however, can be detected in the corresponding values
of $R_{G,V}$ and $R_{G}$ in the four figures. Hence the smooth
empirical Bayes estimators are quite insensitive to the
form of the prior distribution as summarized by S and
K. Therefore, the values of S and K will not be given
for the remaining figures in this chapter. The values of
the prior distribution are designated as follows:

E($\alpha$) = prior mean of $\alpha$;

V($\alpha$) = prior variance of $\alpha$.
To determine the effect of the sample size k on the
ratios $R_{G,V}$ and $R_{G}$, several runs were made with k
ranging from 5 to 20. Results from these investigations
revealed that the quantities in Z can vary in a manner
not affecting the value of Z without having a
noticeable effect on the ratios $R_{G,V}$ and $R_{G}$.
That is, the value of Z and not the individual
quantities in Z determines the values of $R_{G,V}$ and $R_{G}$.
Figures 16, 17, and 18 illustrate this point. By
comparing these figures, it can be seen that although the
parameters k, E($\alpha$), and V($\alpha$) vary widely in each
figure, with the value of Z remaining 3.5, the ratios
$R_{G,V}$ and $R_{G}$ remain relatively unchanged. Since the
ratio R remains relatively unchanged for equivalent
values of Z, the individual quantities in Z will
not be given for the remaining figures in this chapter.

Values of the ratios $R_{G}$ and $R_{G,V}$ are plotted in
Figures 19 and 20 respectively. These ratios are plotted
for different values of Z ranging from 0.5 to 5.0.
Again we notice that as Z increases, the ratios $R_{G,V}$
and $R_{G}$ decrease. In particular for a given value of
Z, the values of $R_{G,V}$ are smaller than those of $R_{G}$,
demonstrating the superiority of the smooth estimator
$\tilde{\alpha}_{G,V}$ over the smooth estimator $\tilde{\alpha}_{G}$. Again this increase
in mean-squared precision is attributed to the prior
density approximation used in $\tilde{\alpha}_{G,V}$. This approximation is
based on a sequence of values having a marginal
distribution whose mean and variance are approximately equivalent
to those of the prior distribution.

Figure 21 represents a typical result obtained when
$\tilde{\alpha}_{G,V}$ is iterated as described in section 4.5. The ratio
$R'_{G,V}$ formed with the iterated smooth estimator $\tilde{\alpha}'_{G,V}$
is represented by the dotted line. As in Chapter III
we notice that the squared-error improvement achieved
by using $\tilde{\alpha}_{G,V}$ is slightly decreased by a second
iteration. This decrease in precision is expected. The
approximation used to represent the prior density
in $\tilde{\alpha}_{G,V}$ is based on a sequence of values whose marginal
distribution has its mean and variance approximately
equivalent to those of the prior distribution. The
prior density approximation in $\tilde{\alpha}'_{G,V}$, however, is not
known to have this property.





Results from a second iteration of $\tilde{\alpha}_{G}$ are given in
Figure 22 for a value of Z identical to that used in
Figure 21. The ratio $R'_{G}$ formed with the smooth estimator
$\tilde{\alpha}'_{G}$ is represented by the dotted line. We note that
iteration of $\tilde{\alpha}_{G}$ increases the mean-squared precision;
however, this improvement is not as significant as that
obtained with $\tilde{\alpha}_{G,V}$. It is conjectured that the
overestimation of the prior variance by the marginal
distribution of the maximum-likelihood estimator has a far more
significant effect on the prior density approximation than
does the marginal variance of $\tilde{\alpha}_{G}$.

In the Monte Carlo study the estimator $\tilde{\alpha}_{G,V}$
provided uniform squared-error improvement over the
maximum-likelihood estimator for two or more experiences. It was
also observed to be the most efficient of the smooth
estimators. Since this improvement was consistent over
a wide variety of Z values, we recommend that the smooth
estimator $\tilde{\alpha}_{G,V}$ be used in all situations.

In Figure 23 the ratio $R_{G,V}$ is plotted as a
function of Z for a given number of experiences. These
plots give some indication of the improvement over
maximum-likelihood one can expect if an estimate for the value of
Z can be obtained.
As discussed in section 4.3, in some instances the
true density function of the maximum-likelihood estimator
for a finite sample size k may be unknown. Its
asymptotic distribution may, however, be known and can be used
to form an efficient smooth estimator. The estimators
$\tilde{\alpha}_{N}$ and $\tilde{\alpha}_{N,V}$, given by (4.22) and (4.29) respectively,
were constructed in this manner. Values of the ratios
$R_{N}$ formed with $\tilde{\alpha}_{N}$ and $R_{N,V}$ formed with $\tilde{\alpha}_{N,V}$ are
plotted in Figures 24 and 25 respectively. These ratios
are plotted for various values of Z ranging from 0.5 to
5.0. Comparison of Figures 24 and 25 with Figure 20 shows
that the estimators $\tilde{\alpha}_{N}$ and $\tilde{\alpha}_{N,V}$ are not as "good" as
the estimator $\tilde{\alpha}_{G,V}$. However, they do provide significant
and uniform squared-error improvement over the
maximum-likelihood estimator for two or more experiences, and this
is of paramount importance.
CHAPTER V


ESTIMATION OF THE WEIBULL SHAPE PARAMETER
WITH KNOWN SCALE PARAMETER

In this chapter, smooth empirical Bayes estimators
are given for the shape parameter $\beta$ in the two-parameter
Weibull distribution. The scale parameter $\alpha$ is assumed
to be known and fixed in each experiment. Results from
Monte Carlo simulations are reported which show that
even for small sample sizes and few experiments the smooth
estimators have smaller mean-squared errors than the
maximum-likelihood estimators.

5.1 Maximum-Likelihood Estimator for $\beta$

Let X be a random variable having a Weibull
distribution with a known scale parameter $\alpha$. If the
shape parameter $\beta$ is a random variable, then the
conditional density function of X is given by

$$f(x \mid \beta) = \alpha\beta\,x^{\beta-1}e^{-\alpha x^{\beta}}, \qquad (x > 0;\ \alpha, \beta > 0). \tag{5.1}$$

Consider a random sample of k observations from
(5.1). The likelihood function of this sample is

$$L(x \mid \beta) = (\alpha\beta)^{k}\prod_{i=1}^{k} x_{i}^{\beta-1}\,e^{-\alpha x_{i}^{\beta}}. \tag{5.2}$$

Upon taking the logarithm of (5.2), differentiating with
respect to $\beta$, and equating to zero, we have

$$\frac{d \log L(x \mid \beta)}{d\beta} = \frac{k}{\beta} + \sum_{i=1}^{k}\log x_{i} - \alpha\sum_{i=1}^{k} x_{i}^{\beta}\log x_{i} = 0. \tag{5.3}$$

This equation can be solved to obtain the
maximum-likelihood estimator $\hat{\beta}$. This may be accomplished with
the aid of standard iterative techniques. In Appendix B
such a procedure is described.
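One way to solve the likelihood equation numerically is bisection, which works because the left side of (5.3) is strictly decreasing in the shape parameter (its derivative is negative term by term). The sketch below is an illustration only and is not the procedure of Appendix B; the function names, the bracketing strategy, and the sample values are assumptions.

```python
import math


def score(beta, xs, alpha):
    """Left side of the likelihood equation (5.3) for a known scale alpha."""
    k = len(xs)
    return (k / beta
            + sum(math.log(x) for x in xs)
            - alpha * sum(x ** beta * math.log(x) for x in xs))


def mle_shape(xs, alpha, tol=1e-12):
    """Solve (5.3) by bisection; the score falls from +infinity as beta grows."""
    lo, hi = 1e-6, 1.0
    while score(hi, xs, alpha) > 0:      # expand until the root is bracketed
        hi *= 2.0
        if hi > 1e6:
            raise ValueError("no root found; is every observation equal to 1?")
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if score(mid, xs, alpha) > 0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


sample = [0.5, 0.8, 1.2, 1.5, 2.0]      # hypothetical observations
print(mle_shape(sample, alpha=1.0))
```

Newton's method would converge faster, but bisection needs no starting value and cannot overshoot into negative values of the shape parameter.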

The maximum-likelihood estimator is not known to be
sufficient for $\beta$, and its exact distributional form
is unknown. The estimator is, however, consistent for
$\beta$ and by virtue of (1.25) is asymptotically normally
distributed, with mean $\beta$ and variance given by

$$\mathrm{Var}(\hat{\beta} \mid \beta) = \left\{-E\!\left[\frac{d^{2}\log L(x \mid \beta)}{d\beta^{2}}\right]\right\}^{-1}. \tag{5.4}$$
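Now

$$\frac{d^{2}\log L(x \mid \beta)}{d\beta^{2}} = -\frac{k}{\beta^{2}} - \alpha\sum_{i=1}^{k} x_{i}^{\beta}(\log x_{i})^{2} \tag{5.5}$$

and the expected value of (5.5) is

$$E\!\left[\frac{d^{2}\log L(x \mid \beta)}{d\beta^{2}}\right] = -\frac{k}{\beta^{2}} - \alpha k\,E\!\left[x^{\beta}(\log x)^{2}\right]. \tag{5.6}$$

Hence (5.4) becomes

$$\mathrm{Var}(\hat{\beta} \mid \beta) = \left\{\frac{k}{\beta^{2}} + \alpha k\,E\!\left[x^{\beta}(\log x)^{2}\right]\right\}^{-1}. \tag{5.7}$$

Consider

$$E\!\left[x^{\beta}(\log x)^{2}\right] = \int_{0}^{\infty} x^{\beta}(\log x)^{2}\,\alpha\beta\,x^{\beta-1}e^{-\alpha x^{\beta}}\,dx. \tag{5.8}$$

If the substitution $u = x^{\beta}$ is made, then

$$E\!\left[x^{\beta}(\log x)^{2}\right] = \int_{0}^{\infty} \frac{\alpha}{\beta^{2}}\,u\,\log^{2}(u)\,e^{-\alpha u}\,du \tag{5.9}$$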
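which, after integrating by parts, becomes

$$
\begin{aligned}
E\!\left[x^{\beta}(\log x)^{2}\right]
 &= \frac{2}{\beta^{2}}\int_{0}^{\infty} e^{-\alpha u}\log u\,du + \frac{1}{\beta^{2}}\int_{0}^{\infty} e^{-\alpha u}\log^{2}u\,du \\
 &= \frac{1}{\alpha\beta^{2}}\left[\frac{\pi^{2}}{6} + \bigl(\log\alpha - \psi(1)\bigr)^{2} - 2\bigl(\log\alpha - \psi(1)\bigr)\right].
\end{aligned}
\tag{5.10}
$$

The function $\psi(1)$ in (5.10) is the digamma function
$\psi(x)$ evaluated at x = 1. Completing the square in
(5.10) and using the recurrence formula
$\psi(x+1) = 1/x + \psi(x)$, we obtain

$$
\begin{aligned}
E\!\left[x^{\beta}(\log x)^{2}\right]
 &= \frac{1}{\alpha\beta^{2}}\left[\frac{\pi^{2}}{6} + \bigl(\log\alpha - \psi(1) - 1\bigr)^{2} - 1\right] \\
 &= \frac{1}{\alpha\beta^{2}}\left[\frac{\pi^{2}}{6} + \bigl(\psi(2) - \log\alpha\bigr)^{2} - 1\right].
\end{aligned}
\tag{5.11}
$$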
Hence (5.7) becomes

$$\mathrm{Var}(\hat{\beta} \mid \beta) = \frac{\beta^{2}}{k\left[\dfrac{\pi^{2}}{6} + \bigl(\psi(2) - \log\alpha\bigr)^{2}\right]}. \tag{5.12}$$
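The chain from (5.7) to (5.12) can be verified numerically. The sketch below assumes the closed form $E[x^{\beta}(\log x)^2] = [\pi^2/6 + (\psi(2) - \log\alpha)^2 - 1]/(\alpha\beta^2)$ for (5.11) and uses $\psi(2) = 1 - \gamma$, where $\gamma$ is Euler's constant. The function names and the parameter values are illustrative assumptions.

```python
import math

EULER_GAMMA = 0.5772156649015329
PSI_2 = 1.0 - EULER_GAMMA          # digamma function at 2, about 0.4227843351


def expected_xb_log2(alpha, beta):
    """Closed form assumed for E[x^beta (log x)^2], as in (5.11)."""
    a = PSI_2 - math.log(alpha)
    return (math.pi ** 2 / 6 + a ** 2 - 1.0) / (alpha * beta ** 2)


def var_shape_mle(alpha, beta, k):
    """Asymptotic variance of the shape estimator, as in (5.12)."""
    a = PSI_2 - math.log(alpha)
    return beta ** 2 / (k * (math.pi ** 2 / 6 + a ** 2))


alpha, beta, k = 1.5, 2.0, 10      # hypothetical values
via_5_7 = 1.0 / (k / beta ** 2 + alpha * k * expected_xb_log2(alpha, beta))
print(via_5_7, var_shape_mle(alpha, beta, k))   # the two routes should agree
```

Substituting the closed form into (5.7) cancels the term $k/\beta^2$ against the $-1$ inside the bracket, which is why (5.12) has such a compact form.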


5.2 Smooth Empirical Bayes Estimators for $\beta$

Assume that the Weibull distribution with known scale
parameter $\alpha$ adequately describes data obtained in each
of n previous experiments. Further assume that,
between successive experiments, the shape parameter $\beta$
varies randomly with the same but unknown prior density
function g($\beta$). If in each experiment a maximum-likelihood
estimate $\hat{\beta}_i$ (i = 1, 2, ..., n) was obtained for each
realization from g($\beta$), then the sequence of estimates

$$\hat{\beta}_{1},\ \hat{\beta}_{2},\ \ldots,\ \hat{\beta}_{n} \tag{5.13}$$

can be used to form an approximation to the marginal
density. This approximation can be represented by

$p_n(\hat{\beta})$

(5.14)

where h is given by (2.8).
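A kernel density approximation of this kind can be sketched as follows. The exact form of (5.14) and of the bandwidth (2.8) is not reproduced in this section, so the code assumes a squared-sine kernel, $K(y) = (2\pi)^{-1}[\sin(y/2)/(y/2)]^2$, by analogy with the sine-function estimator used in Chapter VI, and takes the bandwidth h as a free argument. Both choices, together with the names and sample values, are assumptions and not the dissertation's definitions.

```python
import math


def sine_kernel(y):
    """Assumed kernel: (1 / 2pi) * (sin(y/2) / (y/2))^2, which integrates to 1."""
    if abs(y) < 1e-12:
        return 1.0 / (2.0 * math.pi)
    half = 0.5 * y
    return (math.sin(half) / half) ** 2 / (2.0 * math.pi)


def density_estimate(x, estimates, h):
    """Kernel approximation to the marginal density of the past estimates."""
    n = len(estimates)
    return sum(sine_kernel((x - b) / h) for b in estimates) / (n * h)


past = [1.0, 2.0, 4.0]             # hypothetical maximum-likelihood estimates
print(density_estimate(2.0, past, h=0.5))
```

The kernel is non-negative and smooth, so the resulting approximation is a continuous function of its argument, which is what allows it to stand in for the prior density inside an integral.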



Convergence in probability of (5.14) to the prior
density g($\beta$) as both the sample size k and the number
of experiments n tend to infinity is assured by
Theorem 2.2. Therefore, when the kernel of integration
is normal with mean $\beta$ and variance given by (5.12),
$p_n(\hat{\beta})$ can be considered as a function of $\beta$, and a
continuously smooth empirical Bayes estimator for $\beta_n$
becomes

(5.15)

where $\hat{\beta}_{(1)}$ and $\hat{\beta}_{(n)}$ are the respective minimum and
maximum values of sequence (5.13).

Since the true density function of $\hat{\beta}$, when given
$\beta$, is unknown for a finite sample size, its asymptotic
distribution has been used in (5.15). Such substitutions
were considered in Chapter IV, and results from Monte
Carlo simulation indicated that substantial improvements
in squared-error precision over maximum-likelihood could
be obtained.

Since $\hat{\beta}$ is not known to be a sufficient statistic
for $\beta$, we cannot write $E(\beta \mid x) = E(\beta \mid \hat{\beta})$; thus $\tilde{\beta}_{N}$
involves a further degree of approximation to the Bayes
estimator. In section 5.5, results from Monte Carlo
simulation indicate that $\tilde{\beta}_{N}$ has smaller mean-squared
errors than the maximum-likelihood estimator even though
$\tilde{\beta}_{N}$ involves these approximations.

5.3 A Smooth Empirical Bayes Estimator for $\beta$ Corrected
for Variance

The asymptotic distribution of $\hat{\beta}$ conditional on
$\beta$ was shown to be

$$\operatorname{distr.}(\hat{\beta} \mid \beta) = N\!\left(\beta,\ \frac{\beta^{2}}{kc}\right), \tag{5.16}$$

where

$$c = \frac{\pi^{2}}{6} + \bigl(\psi(2) - \log\alpha\bigr)^{2}.$$

This distribution corresponds to the distribution given
by (2.21). Hence from the calculations in section 2.5,
the mean of the marginal distribution of $\hat{\beta}$ is equal to
the mean of the prior distribution. Its variance,
however, overestimates the prior variance by the amount

$$\frac{\mathrm{Var}(\beta) + E^{2}(\beta)}{kc}.$$

We can now form the transformation

$$\beta^{*} = a_{1}\bigl(\hat{\beta} - E(\hat{\beta})\bigr) + E(\hat{\beta}) \tag{5.17}$$

defined by (2.25). This transformation provides us with
a new sequence of values

$$\beta^{*}_{1},\ \beta^{*}_{2},\ \ldots,\ \beta^{*}_{n}. \tag{5.18}$$

The marginal distribution of this sequence has a mean
and a variance equivalent to those of the prior
distribution. When the mean and variance of the prior
distribution are known, the constant $a_1$ in (5.17) is
given by

$$a_{1} = \left[\frac{kc\,\mathrm{Var}(\beta)}{(1+kc)\,\mathrm{Var}(\beta) + E^{2}(\beta)}\right]^{1/2}. \tag{5.19}$$

Otherwise

$$a_{1} = \left[(1+kc)^{-1}\left(kc - \frac{\bar{\beta}_{n}^{2}}{s_{n}^{2}}\right)\right]^{1/2}, \tag{5.20}$$
where

$$\bar{\beta}_{n} = \sum_{i=1}^{n} \frac{\hat{\beta}_{i}}{n} \tag{5.21}$$

and

$$s_{n}^{2} = \sum_{i=1}^{n} \frac{(\hat{\beta}_{i} - \bar{\beta}_{n})^{2}}{n-1}. \tag{5.22}$$

Any density estimator $p_n(\beta^{*})$ based on sequence
(5.18) has the property that

$$\operatorname*{p\,lim}_{\substack{n \to \infty \\ k \to \infty}} p_{n}(\beta^{*}) = g(\beta).$$

Therefore, when considered as a function of $\beta$,
$p_n(\beta^{*})$ can be used to approximate the prior density
function, and a smooth estimator for $\beta_n$ becomes

(5.23)

where $\beta^{*}_{(1)}$ and $\beta^{*}_{(n)}$ are the respective minimum and
maximum values of sequence (5.18).
5.4 Iteration of the Smooth Estimators

Consider a sequence of smooth estimates from (5.15)

$$\tilde{\beta}_{N,1},\ \tilde{\beta}_{N,2},\ \ldots,\ \tilde{\beta}_{N,n} \tag{5.24}$$

obtained in each of n previous experiments. Based on
this sequence a more "precise" estimate of the
realization $\beta_n$ can often be determined. This estimate is
obtained by replacing the prior density in the Bayes
estimator by a density approximation constructed with
sequence (5.24). In essence this represents an iteration
of the smooth estimator $\tilde{\beta}_{N}$, since the kernel of
integration remains unchanged, and is denoted by $\tilde{\beta}'_{N}$.
A similar iteration of the estimator $\tilde{\beta}_{N,V}$ can be
obtained and will be denoted by $\tilde{\beta}'_{N,V}$. Integration in
each estimator is performed over the range of values
obtained from the first iteration. For example, the
region of integration for $\tilde{\beta}'_{N}$ is from $\min(\tilde{\beta}_{N,i})$ to
$\max(\tilde{\beta}_{N,j})$ where $i, j = 1, 2, \ldots, n$ and $i \neq j$. When
i = 1, we define $\tilde{\beta}_{N,1} = \hat{\beta}_{1}$.

5.5 Monte Carlo Simulation

The continuously smooth empirical Bayes estimators
given in the previous sections were investigated for
small samples by Monte Carlo simulation and compared to
the maximum-likelihood estimator. In each experiment
a value of $\beta$ was generated from a prior distribution
belonging to the Pearson family of distributions. This
parameter was then used to obtain a random sample of
k observations from (5.1). The maximum-likelihood and
continuously smooth empirical Bayes estimators were then
calculated. Five hundred repetitions of twenty
experiments were made for each prior distribution, and the
ratio

$$R = \frac{\text{empirical Bayes mean-squared error}}{\text{maximum-likelihood mean-squared error}}$$

was formed. As in section 3.5, it was found that the
ratio R was significantly influenced by the prior
distribution only through the value

$$Z = \frac{\mathrm{Var}^{*}(\hat{\beta} \mid \beta)}{\mathrm{Var}(\beta)},$$

where * indicates that $E(\beta)$, the prior mean of $\beta$,
has been substituted for $\beta$. In particular for a
random sample of k observations from (5.1), the value
of Z becomes

$$Z = \frac{E^{2}(\beta)}{k\left[\dfrac{\pi^{2}}{6} + \bigl(\psi(2) - \log\alpha\bigr)^{2}\right]\mathrm{Var}(\beta)}, \tag{5.25}$$

where $\psi(2)$ = .4227843351 is the value of the digamma
function $\psi(x)$ evaluated at x = 2. Since the only
factors affecting the ratio R, apart from the number
of experiences, are contained in (5.25), this quantity
can be conveniently used to summarize and index a given
situation.
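The index Z for the shape parameter depends on the known scale parameter through the digamma term. The helper below is a sketch that assumes (5.25) is the asymptotic variance (5.12), evaluated at the prior mean, divided by the prior variance; the function name and the numerical inputs are illustrative assumptions.

```python
import math

PSI_2 = 1.0 - 0.5772156649015329   # digamma function at 2


def z_index_shape(k, alpha, prior_mean, prior_var):
    """Z of (5.25) for sample size k and known scale parameter alpha."""
    c = math.pi ** 2 / 6 + (PSI_2 - math.log(alpha)) ** 2
    return prior_mean ** 2 / (k * c * prior_var)


# Hypothetical situation: k = 10, alpha = 1, prior mean 2.0, prior variance 0.5.
print(z_index_shape(10, 1.0, 2.0, 0.5))
```

Because Z is inversely proportional to k, doubling the sample size halves Z when the prior is held fixed.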

As in the preceding chapters, we again wish to
illustrate the robustness of the smooth estimators to
the form of the prior distribution. Therefore values
of $\beta_i$ (i = 1, 2, ..., 20) were generated from various
Pearson distributions while holding the value of Z
fixed at 1.5. The results are shown in Figures 26-29.
In each figure the coefficients of skewness (S) and
kurtosis (K) were varied to give different forms to the
prior distribution. In Figure 26 the prior distribution
is bell-shaped (skewed); in Figure 27, L-shaped; in
Figure 28, J-shaped; and in Figure 29, U-shaped. The
solid line in each figure represents the ratio $R_{N,V}$
calculated with the smooth estimator $\tilde{\beta}_{N,V}$, and the
broken line represents the ratio $R_{N}$ calculated with
$\tilde{\beta}_{N}$. This representation will be used in the remaining
figures of this chapter. Again we notice that the
smooth estimators are quite insensitive to the form of
the prior distribution as summarized by S and K.
Therefore values of S and K will not be given for
the remaining figures in this chapter. The values of
the prior distribution are designated as follows:

E($\beta$) = prior mean of $\beta$,

V($\beta$) = prior variance of $\beta$.

To determine the effect of the sample size k on
the ratios $R_{N,V}$ and $R_{N}$, several runs were made
with k ranging from 5 to 20. Results from these
investigations revealed that the quantities in Z
can vary in a manner not affecting the value of Z
without having a significant effect on the ratios
$R_{N,V}$ and $R_{N}$. Thus the value of Z, not the
individual quantities in Z, determines the values
of $R_{N,V}$ and $R_{N}$. Figures 30, 31, and 32 illustrate
this point. By comparing these figures, it can be seen
that, although the parameters k, $\alpha$, E($\beta$), and V($\beta$)
vary in each figure with the value of Z remaining
0.8, the ratios $R_{N,V}$ and $R_{N}$ remain relatively
unchanged. Since the ratio R remains relatively
unchanged for equivalent values of Z, the individual
quantities in Z will not be given for the remaining
figures in this chapter.

Values of the ratios $R_{N}$ and $R_{N,V}$ are plotted in
Figures 33 and 34 respectively. These ratios are plotted
for different values of Z ranging from 0.5 to 5.0.
As in the previous chapters, we notice that as Z
increases, the ratios $R_{N}$ and $R_{N,V}$ decrease. In
particular for a given value of Z, the values of $R_{N,V}$
are smaller than those of $R_{N}$, demonstrating the
superiority of the smooth estimator $\tilde{\beta}_{N,V}$ over the smooth
estimator $\tilde{\beta}_{N}$. As in the preceding chapters, the
increase in mean-squared precision by $\tilde{\beta}_{N,V}$ is
attributed to the prior density approximation. This
approximation is based on a sequence of values whose
marginal distribution has its mean and variance
approximately equivalent to those of the prior distribution.

Figure 35 represents a typical result obtained
when $\tilde{\beta}_{N,V}$ is iterated as described in section 5.4.
The ratio $R'_{N,V}$ formed with the iterated smooth
estimator $\tilde{\beta}'_{N,V}$ is represented by the dotted line. As
in the previous chapters, we notice that the
squared-error improvement achieved by using $\tilde{\beta}_{N,V}$ is slightly
decreased by a second iteration. This decrease in
precision is expected. The approximation used to
represent the prior density in $\tilde{\beta}_{N,V}$ is based on a
sequence of values whose marginal distribution has its
mean and variance approximately equivalent to those of
the prior distribution. The approximation used in $\tilde{\beta}'_{N,V}$,
however, is not known to have this property.
Results obtained from a second iteration of $\tilde{\beta}_{N}$
are presented in Figure 36 for a value of Z identical
to that used in Figure 35. The ratio $R'_{N}$ formed with
the smooth estimator $\tilde{\beta}'_{N}$ is represented by the dotted
line. We note that although iteration of $\tilde{\beta}_{N}$ does
increase the mean-squared precision, this increase is
not as significant as that obtained when $\tilde{\beta}_{N,V}$ is used.
It is conjectured that the overestimation of the prior
variance by the marginal distribution of the
maximum-likelihood estimator has a far more significant effect
on the prior density approximation than does the
marginal variance of $\tilde{\beta}_{N}$. In general, iteration of the
smooth estimators is discouraged. While a second
iteration of $\tilde{\beta}_{N}$ does decrease the squared error, the
decrease is not as significant as that obtained using
$\tilde{\beta}_{N,V}$. A second iteration of $\tilde{\beta}_{N,V}$ usually increases
the squared error.

In all cases considered in the Monte Carlo study,
the estimator $\tilde{\beta}_{N,V}$ provided consistent mean-squared
improvement over the maximum-likelihood estimator for
two or more experiences. It was also observed to be the
most efficient of the smooth estimators. Since this
improvement was uniform over a wide variety of Z
values, we are confident that $\tilde{\beta}_{N,V}$ can be used in
any situation. If, however, some idea of the
prior mean and variance is available, Figure 37 can be
used to obtain some indication of the amount of
improvement of $\tilde{\beta}_{N,V}$ over the maximum-likelihood estimator.
In this figure, the ratio $R_{N,V}$ is plotted as a function
of Z for a given number of experiences.


























CHAPTER VI


ESTIMATION IN THE WEIBULL DISTRIBUTION

In this chapter, smooth empirical Bayes estimators
are given for the scale parameter $\alpha$ and the shape
parameter $\beta$ in the two-parameter Weibull distribution.
These estimators are proposed on the assumption that
both parameters are subject to random variation. Results from
Monte Carlo simulations are reported which show that
the smooth estimators have smaller squared errors than
the maximum-likelihood estimators.

6.1 Maximum-Likelihood Estimation

Let X be a random variable having a Weibull
distribution. If $\alpha$ and $\beta$, the scale and the shape
parameters respectively, are random variables, then the
conditional density function of X is given by

$$f(x \mid \alpha, \beta) = \alpha\beta\,x^{\beta-1}e^{-\alpha x^{\beta}}, \qquad (x > 0;\ \alpha, \beta > 0). \tag{6.1}$$

Consider a random sample consisting of k
observations from (6.1). The likelihood function of this
sample is

$$L(x \mid \alpha, \beta) = (\alpha\beta)^{k}\prod_{i=1}^{k} x_{i}^{\beta-1}\,e^{-\alpha x_{i}^{\beta}}. \tag{6.2}$$
Taking the logarithm of (6.2), differentiating with
respect to $\alpha$ and $\beta$ in turn and equating to zero, we
obtain the equations

$$\frac{\partial \log L}{\partial \alpha} = \frac{k}{\alpha} - \sum_{i=1}^{k} x_{i}^{\beta} = 0 \tag{6.3}$$

and

$$\frac{\partial \log L}{\partial \beta} = \frac{k}{\beta} + \sum_{i=1}^{k}\log x_{i} - \alpha\sum_{i=1}^{k} x_{i}^{\beta}\log x_{i} = 0. \tag{6.4}$$

Eliminating $\alpha$ between these two equations and
simplifying, we have

$$\frac{\sum_{i=1}^{k} x_{i}^{\beta}\log x_{i}}{\sum_{i=1}^{k} x_{i}^{\beta}} - \frac{1}{\beta} = \frac{1}{k}\sum_{i=1}^{k}\log x_{i}, \tag{6.5}$$

which may now be solved to obtain the maximum-likelihood
estimator $\hat{\beta}$. This can be accomplished with the aid of
standard iterative procedures. In Appendix B such a
procedure is described.

With $\hat{\beta}$ thus determined, $\alpha$ is estimated from
(6.3) as

$$\hat{\alpha} = \frac{k}{\sum_{i=1}^{k} x_{i}^{\hat{\beta}}}. \tag{6.6}$$
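The two-step solution can be sketched in a few lines: solve the single equation in the shape parameter, then recover the scale estimate from (6.6). The code below is an illustration and not the procedure of Appendix B. It assumes that (6.5) has the standard profile form (weighted mean of log x, minus 1/beta, equal to the plain mean of log x), and the function names, bracketing scheme, and sample values are introduced here.

```python
import math


def profile_equation(beta, logs):
    """Left side minus right side of the profile equation for the shape."""
    top = max(logs)
    # Weights x_i^beta, rescaled by the largest observation to avoid overflow.
    w = [math.exp(beta * (v - top)) for v in logs]
    weighted = sum(wi * v for wi, v in zip(w, logs)) / sum(w)
    return weighted - 1.0 / beta - sum(logs) / len(logs)


def weibull_mle(xs, tol=1e-12):
    """Return (alpha_hat, beta_hat) for the density alpha*beta*x^(beta-1)*exp(-alpha*x^beta)."""
    logs = [math.log(x) for x in xs]
    if max(logs) == min(logs):
        raise ValueError("observations must not all be equal")
    lo, hi = 1e-6, 1.0
    while profile_equation(hi, logs) < 0:   # the function increases with beta
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if profile_equation(mid, logs) < 0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    beta = 0.5 * (lo + hi)
    alpha = len(xs) / sum(x ** beta for x in xs)   # as in (6.6)
    return alpha, beta


print(weibull_mle([0.5, 0.8, 1.2, 1.5, 2.0]))   # hypothetical observations
```

Eliminating the scale parameter first reduces a two-dimensional search to a one-dimensional root-finding problem with a single sign change.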


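The maximum-likelihood estimators $\hat{\alpha}$ and $\hat{\beta}$ are
not known to be jointly sufficient, and their exact
distributional form is unknown for small samples. The
estimators are consistent and by virtue of (1.29) are
asymptotically distributed bivariate normal with mean vector
$\mu = (\alpha, \beta)$ and covariance matrix V given by

$$
V = \begin{bmatrix}
-E\!\left(\dfrac{\partial^{2}\log L}{\partial\alpha^{2}}\right) & -E\!\left(\dfrac{\partial^{2}\log L}{\partial\alpha\,\partial\beta}\right) \\[2ex]
-E\!\left(\dfrac{\partial^{2}\log L}{\partial\alpha\,\partial\beta}\right) & -E\!\left(\dfrac{\partial^{2}\log L}{\partial\beta^{2}}\right)
\end{bmatrix}^{-1}. \tag{6.7}
$$

Thus

$$\operatorname{distr.}(\hat{\alpha}, \hat{\beta} \mid \alpha, \beta) = N(\mu, V). \tag{6.8}$$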


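The expected value of the mixed partial derivatives
in (6.7) can be represented as

$$E\!\left(\frac{\partial^{2}\log L}{\partial\alpha\,\partial\beta}\right) = E\!\left(-\sum_{i=1}^{k} x_{i}^{\beta}\log x_{i}\right) = -k\,E\!\left(x^{\beta}\log x\right) \tag{6.9}$$

since each $x_i$ is independently and identically
distributed according to (6.1). Now

$$E\!\left(x^{\beta}\log x\right) = \int_{0}^{\infty}\left(x^{\beta}\log x\right)\alpha\beta\,x^{\beta-1}e^{-\alpha x^{\beta}}\,dx \tag{6.10}$$

which becomes

$$E\!\left(x^{\beta}\log x\right) = \frac{1}{\alpha\beta}\bigl(\psi(2) - \log\alpha\bigr) \tag{6.11}$$

after integrating (6.10) by parts. The function $\psi(2)$
is the digamma function $\psi(x)$ evaluated at x = 2.
Thus

$$-E\!\left(\frac{\partial^{2}\log L}{\partial\alpha\,\partial\beta}\right) = \frac{k}{\alpha\beta}\bigl(\psi(2) - \log\alpha\bigr), \tag{6.12}$$

and the inverse of the covariance matrix can be written

$$
V^{-1} = \begin{bmatrix}
\dfrac{k}{\alpha^{2}} & \dfrac{kA}{\alpha\beta} \\[2ex]
\dfrac{kA}{\alpha\beta} & \dfrac{k}{\beta^{2}}\left(\dfrac{\pi^{2}}{6} + A^{2}\right)
\end{bmatrix}, \tag{6.13}
$$

where $A = \psi(2) - \log\alpha$ and the main diagonal elements
are given by the reciprocal of (4.9) and (5.12)
respectively. The covariance matrix V can now be
given as

$$
V = \frac{6}{k\pi^{2}}\begin{bmatrix}
\alpha^{2}\left(\dfrac{\pi^{2}}{6} + A^{2}\right) & -\alpha\beta A \\[2ex]
-\alpha\beta A & \beta^{2}
\end{bmatrix}. \tag{6.14}
$$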
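The relation between the information matrix and the covariance matrix can be checked directly. The sketch below assumes the entries of (6.13) are $k/\alpha^2$, $kA/(\alpha\beta)$, and $k(\pi^2/6 + A^2)/\beta^2$ with $A = \psi(2) - \log\alpha$, consistent with (6.12) and the reciprocal of (5.12), and obtains the covariance matrix by inverting the 2 x 2 matrix. The function names and the parameter values are illustrative assumptions.

```python
import math

PSI_2 = 1.0 - 0.5772156649015329   # digamma function at 2


def information_matrix(alpha, beta, k):
    """Assumed entries of the inverse covariance matrix, as in (6.13)."""
    a = PSI_2 - math.log(alpha)
    off = k * a / (alpha * beta)
    return [[k / alpha ** 2, off],
            [off, k * (math.pi ** 2 / 6 + a ** 2) / beta ** 2]]


def covariance_matrix(alpha, beta, k):
    """Invert the 2 x 2 information matrix to obtain V."""
    (p, q), (_, r) = information_matrix(alpha, beta, k)
    det = p * r - q * q
    return [[r / det, -q / det], [-q / det, p / det]]


V = covariance_matrix(2.0, 1.5, 10)        # hypothetical alpha, beta, k
rho = V[0][1] / math.sqrt(V[0][0] * V[1][1])
print(V, rho)
```

The determinant reduces to $k^2\pi^2/(6\alpha^2\beta^2)$, so the variance of the shape estimator is $6\beta^2/(k\pi^2)$ whatever the value of A, and the correlation between the two estimators is $-A/(\pi^2/6 + A^2)^{1/2}$.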

6.2 Smooth Empirical Bayes Estimation

Assume that an unobservable random parameter vector
$\mu = (\alpha, \beta)$ occurs according to the bivariate density
function $g(\mu)$. When $\mu$ is realized, an observable
random vector X from (6.1) occurs and a
maximum-likelihood estimate $\hat{\mu}$ of $\mu$ is formed. Now if this
situation occurs repeatedly, then the vector-valued
sequence

$$\hat{\mu}_{1},\ \hat{\mu}_{2},\ \ldots,\ \hat{\mu}_{n} \tag{6.15}$$

of maximum-likelihood estimates can be used to form a
marginal density approximation $p_n(\hat{\mu})$. Convergence in
probability of $p_n(\hat{\mu})$ to the prior density $g(\mu)$ is assured
by Theorem 2.2. Hence, $p_n(\hat{\mu})$ can be considered as a
function of $\mu$, and a continuously smooth empirical
Bayes estimate of $\mu_n$ can be given. Here as in
Chapter V, the true density function of $\hat{\mu}$ given $\mu$ is
unknown for small samples, and $\hat{\mu}$ is not known to be
jointly sufficient for $\mu$. Thus $E(\mu \mid x) \neq E(\mu \mid \hat{\mu})$.
Nevertheless, a smooth estimator for $\mu$ can be given,
and in section 6.3 its ability to provide squared-error
improvement over maximum-likelihood will be illustrated.

In the remainder of this chapter, it will be
advantageous to omit vector notation, restricting our attention
to the respective components of the vectors $\mu$ and $\hat{\mu}$.
Sequence (6.15) will now be written as

$$(\hat{\alpha}_{1}, \hat{\beta}_{1}),\ (\hat{\alpha}_{2}, \hat{\beta}_{2}),\ \ldots,\ (\hat{\alpha}_{n}, \hat{\beta}_{n}). \tag{6.16}$$

Using the multivariate density estimators proposed
by Martz [19], we form

(6.17)

the marginal density estimator for $p(\hat{\alpha}, \hat{\beta})$. In a
practical situation, the maximum-likelihood estimates will
generally be quantities that carry units. These units can be
removed from the arguments of the sine function in
(6.17) by defining
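$$h_{\alpha} = s_{\alpha}\,n^{-1/5} \tag{6.18}$$

and

$$h_{\beta} = s_{\beta}\,n^{-1/5} \tag{6.19}$$

where

$$s_{\alpha}^{2} = \sum_{i=1}^{n} (\hat{\alpha}_{i} - \bar{\alpha})^{2}/n, \qquad s_{\beta}^{2} = \sum_{i=1}^{n} (\hat{\beta}_{i} - \bar{\beta})^{2}/n \tag{6.20}$$

with

$$\bar{\alpha} = \sum_{i=1}^{n} \frac{\hat{\alpha}_{i}}{n}, \qquad \bar{\beta} = \sum_{i=1}^{n} \frac{\hat{\beta}_{i}}{n}. \tag{6.21}$$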
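The bandwidths of (6.18) and (6.19) scale with the data, which is what makes the kernel arguments dimensionless. The fragment below is a small sketch of that point; the function name and the sample values are assumptions, and the standard deviation uses the divisor n as in (6.20).

```python
from statistics import pstdev


def bandwidth(estimates):
    """h = s * n^(-1/5), with s the standard deviation using divisor n."""
    n = len(estimates)
    return pstdev(estimates) * n ** (-1.0 / 5.0)


alpha_hats = [1.0, 2.0, 3.0, 4.0]            # hypothetical estimates
h = bandwidth(alpha_hats)
rescaled = bandwidth([10.0 * a for a in alpha_hats])
print(h, rescaled)                            # the second is ten times the first
```

Changing the unit of measurement multiplies both the differences between estimates and the bandwidth by the same factor, so their ratio inside the sine function is unchanged.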


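Considering $p_n(\alpha, \beta)$ as a function of $\alpha$ and $\beta$
and using (6.8) as the kernel of integration, we obtain
the smooth estimators

$$
\tilde{\alpha}_{N} =
\frac{\displaystyle\int_{\hat{\beta}_{(1)}}^{\hat{\beta}_{(n)}}\!\int_{\hat{\alpha}_{(1)}}^{\hat{\alpha}_{(n)}}
 \alpha\,\frac{e^{-\Phi/[2(1-\rho^{2})]}}{\sigma_{\alpha}\sigma_{\beta}\sqrt{1-\rho^{2}}}\;p_{n}(\alpha, \beta)\,d\alpha\,d\beta}
{\displaystyle\int_{\hat{\beta}_{(1)}}^{\hat{\beta}_{(n)}}\!\int_{\hat{\alpha}_{(1)}}^{\hat{\alpha}_{(n)}}
 \frac{e^{-\Phi/[2(1-\rho^{2})]}}{\sigma_{\alpha}\sigma_{\beta}\sqrt{1-\rho^{2}}}\;p_{n}(\alpha, \beta)\,d\alpha\,d\beta}
\tag{6.22}
$$

and

$$
\tilde{\beta}_{N} =
\frac{\displaystyle\int_{\hat{\beta}_{(1)}}^{\hat{\beta}_{(n)}}\!\int_{\hat{\alpha}_{(1)}}^{\hat{\alpha}_{(n)}}
 \beta\,\frac{e^{-\Phi/[2(1-\rho^{2})]}}{\sigma_{\alpha}\sigma_{\beta}\sqrt{1-\rho^{2}}}\;p_{n}(\alpha, \beta)\,d\alpha\,d\beta}
{\displaystyle\int_{\hat{\beta}_{(1)}}^{\hat{\beta}_{(n)}}\!\int_{\hat{\alpha}_{(1)}}^{\hat{\alpha}_{(n)}}
 \frac{e^{-\Phi/[2(1-\rho^{2})]}}{\sigma_{\alpha}\sigma_{\beta}\sqrt{1-\rho^{2}}}\;p_{n}(\alpha, \beta)\,d\alpha\,d\beta}
\tag{6.23}
$$

for $\alpha_n$ and $\beta_n$ respectively. Here

$$
\sigma_{\alpha} = \alpha\left[\frac{1}{k}\left(1 + \frac{6A^{2}}{\pi^{2}}\right)\right]^{1/2}, \qquad
\sigma_{\beta} = \frac{\beta}{\pi}\left(\frac{k}{6}\right)^{-1/2}, \qquad
\rho = \frac{-A}{\left(\dfrac{\pi^{2}}{6} + A^{2}\right)^{1/2}},
$$

where

$$
\Phi = \xi^{2} - 2\rho\,\xi\eta + \eta^{2}, \qquad
A = \psi(2) - \log\alpha, \qquad
\xi = \frac{\alpha - \hat{\alpha}_{n}}{\sigma_{\alpha}}, \qquad
\eta = \frac{\beta - \hat{\beta}_{n}}{\sigma_{\beta}},
$$

and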
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n 


P n (a 3 3) 


2irnh h„ 

« 6 1= 1 


E 


sm 


'a-aA / 3 — 3 / 
1 ' sin ' 1 


2h 


2h 


3 


a-a . 

i 

2h 


6 - 6 . 

,2h7 


(6.24) 


where $h_\alpha$ and $h_\beta$ are given by (6.18) and (6.19) respectively.
The limits of integration in (6.22) were obtained
by ordering the components of sequence (6.16) for $\hat\alpha$ and
$\hat\beta$ respectively.

The smooth estimators $\tilde\alpha_N$ and $\tilde\beta_N$ require the
estimation of bivariate density functions which are 
difficult to estimate accurately. Also, to obtain point 
estimates from each of these estimators, double integra- 
tion must be performed over regions which have been 
approximated from sample data. To avoid a substantial 
decrease in mean-squared precision which could be created 
by the compounding effect of such approximations, we 
propose the use of marginal empirical Bayes estimators 
as a possible alternative. 


The smooth empirical Bayes estimator $\tilde\alpha_N$ has been
used to represent an approximation to $E(\alpha \mid \hat\alpha, \hat\beta)$, even
though $\hat\alpha$ and $\hat\beta$ are not jointly sufficient for $\alpha$
and $\beta$. We extend this approximation further by
constructing a continuously smooth estimator for $\alpha$
based on the marginal Bayes estimator $E(\alpha \mid \hat\alpha)$. Similarly,
a continuously smooth estimator for $\beta$ will be based on
the marginal Bayes estimator $E(\beta \mid \hat\beta)$. By constructing
such estimators, we are tacitly assuming that $\hat\alpha$ and
$\hat\beta$ are independently distributed and marginally
sufficient for $\alpha$ and $\beta$ respectively, and that $\alpha$
and $\beta$ are independently distributed. The results of
such approximations are considered in section 6.3. We
remark that the use of marginal empirical Bayes estimators
is partly motivated by the results Clemmer and
Krutchkoff [3] obtained with such approximations.

The marginal empirical Bayes estimator for $\alpha$ can
therefore be written as


a 


M 




p (a) da 
n 


_ 1 _ 

a 

a 



• p n ( ot) da 


(6.25) 


6 



where $p_n(\alpha)$ is given by

$$ p_n(\alpha) = \frac{1}{2\pi n h_\alpha} \sum_{i=1}^{n} \left[ \frac{\sin\left( \frac{\alpha - \hat\alpha_i}{2h_\alpha} \right)}{\frac{\alpha - \hat\alpha_i}{2h_\alpha}} \right]^2 . \tag{6.26} $$
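The marginal density estimator can be sketched as follows, assuming it takes the squared sine-ratio kernel form with normalizing constant 1/(2 pi n h), which makes each kernel integrate to 1/n. The function name `sinc_squared_density` is hypothetical.

```python
import math

def sinc_squared_density(x, estimates, h):
    """Kernel density estimate at x built from previous ML estimates.

    Each estimate e contributes [sin(u)/u]**2 with u = (x - e) / (2h);
    the factor 1 / (2*pi*n*h) makes the whole estimate integrate to one.
    """
    n = len(estimates)
    total = 0.0
    for e in estimates:
        u = (x - e) / (2.0 * h)
        # sin(u)/u tends to 1 as u tends to 0, so guard the removable singularity.
        ratio = 1.0 if abs(u) < 1e-12 else math.sin(u) / u
        total += ratio * ratio
    return total / (2.0 * math.pi * n * h)
```

The kernel is nonnegative everywhere but has slowly decaying tails, so a finite integration range such as the one between the smallest and largest ordered estimates truncates a small amount of mass.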


The marginal empirical Bayes estimator for $\beta$ is given
by


M 


it 

i=l 


(n) 


( 1 ) 


l/V* 5 ’ 

2 \ a 


P ' p ( 3) d3 

n 


n |hn> 

z . 

1=1 •'hi) 


where $p_n(\beta)$ is given by


l/V 6 


1 ^ \ & Q ) 

h e p n (6) de 
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( 6 . 27 ) 
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( 6 . 28 ) 
6.3 Monte Carlo Simulation

The continuously smooth empirical Bayes estimators
$\tilde\alpha$ and $\tilde\beta$ were investigated by Monte Carlo simulation
and compared to the maximum-likelihood estimators. The
criterion for comparison was mean-squared error, and
therefore the ratio

$$ R = \frac{\text{empirical Bayes mean-squared error}}{\text{maximum-likelihood mean-squared error}} $$

was of interest. Here the notation $\tilde\alpha$ and $\tilde\beta$ is used
to represent any of the smooth estimators given in the
preceding section for the parameters $\alpha$ and $\beta$
respectively.

Values of $\alpha$ and $\beta$ were generated from chosen
prior distributions. Then a random sample of $k$ observations
corresponding to the realizations $\alpha$ and $\beta$
was generated by (6.1). The maximum-likelihood estimators
$\hat\alpha$ and $\hat\beta$ were found, and their squared deviations
from the values of the corresponding parameters were
calculated. For the second experiment new values of
$\alpha$ and $\beta$ were generated. The procedure was repeated
with $\hat\alpha$ and $\hat\beta$, and their squared deviations were
calculated. For this experiment, $\tilde\alpha$ and $\tilde\beta$ and their
squared deviations were also calculated. This procedure
was repeated 20 times, and each time $\tilde\alpha$ and $\tilde\beta$ were
calculated using the present values of $\hat\alpha$ and $\hat\beta$ as
well as all previous maximum-likelihood estimates. Five
hundred repetitions of this run of 20 experiments were
then made, and the averages of the squared deviations
of $\hat\alpha$, $\hat\beta$, $\tilde\alpha$, and $\tilde\beta$ were formed as estimates of
$E(\hat\alpha - \alpha)^2$, $E(\hat\beta - \beta)^2$, $E(\tilde\alpha - \alpha)^2$, and $E(\tilde\beta - \beta)^2$ respectively.
Then the ratios $R_\alpha$ and $R_\beta$ were calculated
utilizing these estimated mean-squared errors.
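The bookkeeping of this procedure can be sketched in a few lines. The sketch below is deliberately simplified and rests on several assumptions that are not from the text: the shape parameter is held fixed so that the maximum-likelihood estimate of the scale has a closed form, the Weibull distribution function is taken as F(x) = 1 - exp(-alpha x^beta), the prior is uniform, and `standin_eb` is a hypothetical placeholder that merely shows where a smooth estimator would be plugged in.

```python
import random

def weibull_sample(alpha, beta, k, rng):
    # Inverse-CDF draws, assuming F(x) = 1 - exp(-alpha * x**beta).
    return [(rng.expovariate(1.0) / alpha) ** (1.0 / beta) for _ in range(k)]

def mle_alpha(sample, beta):
    # Closed-form ML estimate of alpha when beta is known.
    return len(sample) / sum(x ** beta for x in sample)

def standin_eb(current, past):
    # Hypothetical placeholder, NOT the smooth estimator of this chapter:
    # average the current ML estimate with the mean of the earlier ones.
    if not past:
        return current
    return 0.5 * current + 0.5 * sum(past) / len(past)

def mse_ratios(n_experiments=20, n_reps=500, k=20, beta=2.0, seed=1):
    rng = random.Random(seed)
    se_ml = [0.0] * n_experiments   # summed squared errors, ML estimator
    se_eb = [0.0] * n_experiments   # summed squared errors, empirical Bayes
    for _ in range(n_reps):
        past = []
        for j in range(n_experiments):
            alpha = rng.uniform(1.0, 3.0)          # new parameter from the prior
            sample = weibull_sample(alpha, beta, k, rng)
            a_hat = mle_alpha(sample, beta)
            a_tilde = standin_eb(a_hat, past)      # uses present and past estimates
            se_ml[j] += (a_hat - alpha) ** 2
            se_eb[j] += (a_tilde - alpha) ** 2
            past.append(a_hat)
    # Ratio R for each experience number; values below 1 favor empirical Bayes.
    return [eb / ml for eb, ml in zip(se_eb, se_ml)]
```

At the first experience no previous estimates exist, so the placeholder returns the maximum-likelihood estimate and the ratio equals one.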

In the Monte Carlo simulation, numerical integration
was performed using the Gauss quadrature formula
described in Appendix A. The numerical integrations in
the smooth estimators $\tilde\alpha_N$ and $\tilde\beta_N$ were calculated to
a desired degree of accuracy by means of halving the
intervals of integration as discussed in Appendix A.

The numerical solution of several double integrals is
required to form the smooth estimators $\tilde\alpha_N$ and $\tilde\beta_N$.
Therefore in the Monte Carlo simulation, checking for
convergence proved to be extremely time consuming.

Several runs were made in which the integrals were
calculated without requiring the time-consuming accuracy
check. The results from these runs were directly compared
to those obtained when accuracy checking was
employed. This comparison showed that the results
almost always agreed to three significant
digits, and in cases where they differed, the
mean-squared errors of $\tilde\alpha_N$ and $\tilde\beta_N$, formed without
consideration of integral convergence, were only slightly
greater than those obtained when integral convergence
was considered. Thus any bias associated with the
ratios $R_\alpha$ or $R_\beta$ formed with these mean-squared errors
would be in favor of the maximum-likelihood estimators.
Since this bias was slight and computer run time was
significantly reduced, any gain in precision by checking
for accuracy was sacrificed in favor of the reduced
run time.
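The accuracy check described above can be illustrated with a composite Gauss-Legendre rule whose panels are halved until two successive results agree. This is a sketch only: the three-point rule, the tolerance, and the names `gauss_panel` and `integrate` are assumptions, since the formula of Appendix A is not reproduced here.

```python
import math

# Nodes and weights of the 3-point Gauss-Legendre rule on [-1, 1].
_NODES = (-math.sqrt(3.0 / 5.0), 0.0, math.sqrt(3.0 / 5.0))
_WEIGHTS = (5.0 / 9.0, 8.0 / 9.0, 5.0 / 9.0)

def gauss_panel(f, a, b):
    """Apply the 3-point rule to f on the single interval [a, b]."""
    half = 0.5 * (b - a)
    mid = 0.5 * (a + b)
    return half * sum(w * f(mid + half * t) for t, w in zip(_NODES, _WEIGHTS))

def integrate(f, a, b, rel_tol=1e-8, max_halvings=20):
    """Halve the panels until successive composite results agree."""
    panels = 1
    previous = gauss_panel(f, a, b)
    for _ in range(max_halvings):
        panels *= 2
        width = (b - a) / panels
        current = sum(
            gauss_panel(f, a + i * width, a + (i + 1) * width)
            for i in range(panels)
        )
        if abs(current - previous) <= rel_tol * max(1.0, abs(current)):
            return current
        previous = current
    return previous

area = integrate(math.sin, 0.0, math.pi)
```

Each halving doubles the number of integrand evaluations, which is why the check becomes costly when it sits inside the double integrals of a long simulation.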

The parameters $\alpha$ and $\beta$ were generated independently
in the Monte Carlo studies. The theory does not
require that $\alpha$ and $\beta$ be independently distributed;
however, for ease in obtaining random values from the
Pearson distributions this was taken to be the case. As
in the previous chapters, results indicated that the
ratio $R$ depended on the distributions only as they
influenced the value of the ratio of the conditional
variance of the maximum-likelihood estimator to the
variance of the parameter, given the corresponding
parameter value. Thus, for $\alpha$,


$$ Z_\alpha = \frac{E^2(\alpha) \left[ 1 + A^2 / (\pi^2/6) \right]}{k \, \mathrm{Var}(\alpha)} \tag{6.29} $$

where $A = \psi(2) - \log E(\alpha)$ and $\psi(2) = .4227843351$;
and, for $\beta$,

$$ Z_\beta = \frac{E^2(\beta) \, (6/\pi^2)}{k \, \mathrm{Var}(\beta)} . \tag{6.30} $$


In (6.29) and (6.30), the values of the parameters $\alpha$
and $\beta$ have been replaced by their expected values.
Apart from the number of experiences, the only factors
found to affect the ratios $R_\alpha$ and $R_\beta$ were contained
in $Z_\alpha$ and $Z_\beta$ respectively. Therefore, these
quantities can be conveniently used to summarize and
index a given situation. In particular, it was found
that for a given value of $Z$, the ratio $R$ remained
relatively unchanged regardless of the correlation
between the maximum-likelihood estimators.
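The summary quantities are simple to evaluate once the prior mean and variance are fixed. The sketch below assumes that (6.29) and (6.30) are the asymptotic variances of the Weibull maximum-likelihood estimators, evaluated at the prior means, divided by the prior variances; the function names are hypothetical.

```python
import math

PSI_2 = 0.4227843351  # psi(2), the value quoted in the text

def z_alpha(prior_mean, prior_var, k):
    """Conditional variance of the ML estimate of alpha over the prior variance."""
    a = PSI_2 - math.log(prior_mean)
    return prior_mean ** 2 * (1.0 + a * a / (math.pi ** 2 / 6.0)) / (k * prior_var)

def z_beta(prior_mean, prior_var, k):
    """Conditional variance of the ML estimate of beta over the prior variance."""
    return prior_mean ** 2 * (6.0 / math.pi ** 2) / (k * prior_var)
```

For a fixed sample size k, a prescribed value such as Z = 2.0 can be reached by choosing the prior variance that solves these expressions for a given prior mean.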


In order to support the claim that the smooth
estimators are indeed robust to the form of the prior
distribution, the parameters $\alpha$ and $\beta$ were independently
generated from various Pearson distributions.
For all types of Pearson prior distributions with varying
coefficients of skewness and kurtosis, the ratios $R_\alpha$
and $R_\beta$ have been observed to vary only slightly for a
given number of experiences, provided the values of
$Z_\alpha$ and $Z_\beta$ remain unchanged. Illustrations of this
fact are presented in Figures 38-41 for given values of
$Z_\alpha = Z_\beta = 2.0$. For convenience the same coefficients
of skewness (S) and kurtosis (K) were used in each
situation; however, the first four moments of the prior
distribution were different. In Figures 38, 39, 40,
and 41, the prior distribution is bell-shaped (skewed),
L-shaped, J-shaped, and U-shaped respectively. The
solid line represents the ratio $R_{\alpha,N}$ calculated with
the smooth estimator $\tilde\alpha_N$ given by (6.22), and the
broken line represents the ratio $R_{\beta,N}$ calculated with
the smooth estimator $\tilde\beta_N$ given by (6.23). The
parameters of the prior distributions are designated as
follows:


    $E(\alpha)$ = prior mean of $\alpha$
    $V(\alpha)$ = prior variance of $\alpha$
    $E(\beta)$ = prior mean of $\beta$
    $V(\beta)$ = prior variance of $\beta$


We remark that the ratios $R_{\alpha,M}$ and $R_{\beta,M}$ calculated
with the marginal empirical Bayes estimators $\tilde\alpha_M$ and
$\tilde\beta_M$ given by (6.25) and (6.27) respectively are also
robust to the form of prior distribution.

In the Monte Carlo study it has been repeatedly
observed that the maximum-likelihood estimates vary
widely for a random sample consisting of fewer than
15 observations. These fluctuations caused the averaged
squared errors over 500 repetitions to represent
poor approximations to $E(\hat\alpha - \alpha)^2$ and $E(\hat\beta - \beta)^2$. Therefore
in each case reported in this section, $k$ was fixed at
20. For $k > 15$ we observed that for a given value
of $Z$, the ratio $R$ was unaffected by choice of
sample size. Thus restricting $k$ to 20 causes no loss
in generality.

The results given above were based on the assumption
of independent prior distributions for $\alpha$ and $\beta$. A
question which naturally arises is: What effect does
correlation between $\alpha$ and $\beta$ have on the ratios $R_\alpha$
and $R_\beta$? To answer this question a bivariate normal
prior distribution was assumed for the parameters $\alpha$
and $\beta$. For given values of $Z_\alpha$ and $Z_\beta$ it was found
that as the correlation $\rho$ between $\alpha$ and $\beta$ increased,
the ratios $R_{\alpha,N}$ and $R_{\beta,N}$ decreased. Also the same
degree of positive and negative correlation gave similar
results. For example, correlations of $\rho = 0.5$ and
$\rho = -0.5$ gave similar results for equivalent values
of $Z_\alpha$ and $Z_\beta$.

Figures 42-47 illustrate the increase in mean-squared
precision achieved by the smooth estimators $\tilde\alpha_N$
and $\tilde\beta_N$ over the maximum-likelihood estimators $\hat\alpha$ and
$\hat\beta$ for various values of $\rho$ and $Z$. Figures 42, 43,
and 44 illustrate the improvements achieved by $\tilde\alpha_N$ when
$\rho = 0.0$, $\rho = 0.5$, and $\rho = 0.9$ respectively for
several values of $Z_\alpha$. Figures 45, 46, and 47 illustrate
the improvement achieved by $\tilde\beta_N$ when $\rho = 0.0$,
$\rho = 0.5$, and $\rho = 0.9$ respectively for several values
of $Z_\beta$. We note that for a given value of $Z_\alpha$, as
$\rho$ increases, $R_{\alpha,N}$ decreases. Similarly for a given
value of $Z_\beta$, as $\rho$ increases, $R_{\beta,N}$ decreases.

The amount of correlation between $\alpha$ and $\beta$ has
been observed to have no effect on the ratios $R_{\alpha,M}$
and $R_{\beta,M}$ formed with the marginal estimators $\tilde\alpha_M$ and
$\tilde\beta_M$ respectively. This is, of course, expected since
these smooth estimators were based on the assumption of
independent prior distributions. Values of the ratios
$R_{\alpha,M}$ and $R_{\beta,M}$ are plotted in Figures 48 and 49
respectively for various values of $Z_\alpha$ and $Z_\beta$
ranging from 0.5 to 5.0. It is of particular importance
that for a given value of $Z_\alpha$ and a fixed number of
experiences, the value of $R_{\alpha,M}$ is less than that of
$R_{\alpha,N}$ regardless of the value of $\rho$. Similar results
are witnessed by observing corresponding values of $R_{\beta,M}$
and $R_{\beta,N}$. Thus the marginal smooth estimators $\tilde\alpha_M$
and $\tilde\beta_M$ provide "better" results than the estimators
$\tilde\alpha_N$ and $\tilde\beta_N$ even though the former are based on the
assumption of independent prior densities.

It is conjectured that the compounding effect of
being unable to accurately approximate bivariate densities
and having to perform double integration over
regions approximated from sample data causes this
phenomenon. The errors produced by such approximations
in $\tilde\alpha_N$ and $\tilde\beta_N$ appear to be far more significant than
the errors produced by the independence assumptions on
which the marginal estimators $\tilde\alpha_M$ and $\tilde\beta_M$ are based.
We are confident that $\tilde\alpha_M$ and $\tilde\beta_M$ will give "best"
results in any situation since they provided uniform
improvement over $\tilde\alpha_N$ and $\tilde\beta_N$ regardless of the value of
$Z$ or the degree of correlation between $\alpha$ and $\beta$.
CHAPTER VII


COMPARISONS WITH TWO ALTERNATIVE EMPIRICAL 
BAYES ESTIMATORS 


In this chapter two alternative empirical Bayes 
estimators are considered. Where applicable, these 
estimators are applied to the distributions considered 
in the preceding chapters. Results from Monte Carlo 
simulation with these estimators are reported and directly 
compared with the results obtained from the corresponding 
continuously smooth empirical Bayes estimators. 

7.1 Alternative Empirical Bayes Estimators

The methods of empirical Bayes estimation can be 
partitioned into two distinct classes. The first class 
consists of those methods which attempt to obtain empirical 
Bayes estimators without requiring explicit estimation of 
the prior distribution. The well-defined families devel- 
oped by Rutherford and Krutchkoff [28] typify this 
technique. The second class consists of those methods 
which endeavor to obtain empirical Bayes estimators by 
considering an approximation to the prior distribution 
function. The method proposed by Lemon and Krutchkoff [17] 
demonstrates this technique. In a well-defined sense, the
continuously smooth estimator does not belong to either of
these classes. The continuously smooth estimator,
however, can be considered an analog to the second method
of estimation since it too attempts to approximate some
form of the prior distribution.

Rutherford and Krutchkoff [28] established several
well-defined families of distribution functions. Each
family provides a unique empirical Bayes estimator
$E_n(\theta \mid x)$ having the property that

$$ \operatorname*{p\,lim}_{n \to \infty} E_n(\theta \mid x) = E(\theta \mid x) \tag{7.1} $$

for all $x$. This property was considered desirable since
previously [29] they had shown that $\varepsilon$-asymptotic optimality
could be obtained by a truncated version of
consistent estimators for the Bayes estimator. In practical
situations, $\varepsilon$-asymptotic optimality is equivalent
to asymptotic optimality which is defined by

$$ \operatorname*{p\,lim}_{n \to \infty} R(\delta_n, G) = R(G) . \tag{7.2} $$

Here $R(\cdot,\cdot)$ represents the overall risk, $\delta_n$ an empirical
Bayes decision function, and $R(G)$ the Bayes risk. These
elements were described in detail in section 1.4.

In particular we will be concerned with families
$F_1$ and $F_2$. For completeness we include their definitions.
A family of distributions $\{F(x \mid \theta) : \theta \in \Theta\}$ is said
to be a member of $F_1$ if

i) the random variable $X$ is discrete for each
$\theta$, and

ii) the probability mass function $P(x \mid \theta)$ is such that

$$ \frac{P(x + 1 \mid \theta)}{P(x \mid \theta)} = a(x) + \theta\, b(x) $$

where $a(x)$ and $b(x)$ are any functions such that
$b(x) \neq 0$. If $P_n(x)$ and $P_n(x + 1)$ are consistent estimators
for the marginal probability mass function $P(x)$,
then a consistent empirical Bayes estimate of $\theta_n$ is
given by

$$ \theta^*_D = \frac{P_n(x_n + 1)}{b(x_n)\, P_n(x_n)} - \frac{a(x_n)}{b(x_n)} . $$

A family of distributions $\{F(x \mid \theta) : \theta \in \Theta\}$ is said to be
a member of $F_2$ if

i) $X$ is a continuous random variable for all
$\theta \in \Theta$, and

ii) the probability densities $f(x \mid \theta)$ are such that

$$ \frac{\partial f(x \mid \theta) / \partial x}{f(x \mid \theta)} = a(x) + \theta\, b(x) $$

where $a(x)$ and $b(x)$ are any functions such that
$b(x) \neq 0$. If $f_n(x)$ and $f'_n(x)$ are consistent estimators
for the marginal density and its first derivative
respectively, then a consistent empirical Bayes estimate
of $\theta_n$ is given by

$$ \theta^*_D = \frac{f'_n(x_n)}{b(x_n)\, f_n(x_n)} - \frac{a(x_n)}{b(x_n)} . $$

For a discussion on estimating $f_n(x)$, see Rutherford [27].


Lemon and Krutchkoff [17] proposed a general 
smoothing technique for obtaining empirical Bayes 
estimators. One particular estimator is essentially 
obtained by the replacement of the prior distribution 
by a step function having steps of equal height 1/n at 
each of n previous classical estimates. This estimator 
can be represented by 


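$$ \hat\theta_D = \frac{\sum_{i=1}^{n} \hat\theta_i\, f(\hat\theta_n \mid \hat\theta_i)}{\sum_{i=1}^{n} f(\hat\theta_n \mid \hat\theta_i)} \tag{7.3} $$

where $\hat\theta_i$ represents the $i$th classical estimate of $\theta$
and $f(\cdot \mid \cdot)$ is the kernel of integration.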
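A small sketch may make the step-function estimator concrete. It assumes (7.3) is the ratio of kernel-weighted sums over the sequence of classical estimates, with the current estimate included in the sequence; the Poisson kernel and the function names are illustrative choices only.

```python
import math

def poisson_pmf(x, theta):
    """Poisson probability of the count x when the mean is theta."""
    return math.exp(-theta) * theta ** x / math.factorial(x)

def step_function_estimate(current, estimates, kernel):
    """Weighted average of the classical estimates.

    The prior places mass 1/n on each classical estimate, so the posterior
    weight of estimates[i] is proportional to kernel(current, estimates[i]).
    """
    weights = [kernel(current, e) for e in estimates]
    return sum(e * w for e, w in zip(estimates, weights)) / sum(weights)

# Made-up counts from three experiments; the last one is the current estimate.
counts = [2, 5, 3]
theta_hat = step_function_estimate(counts[-1], counts, poisson_pmf)
```

Because the result is a convex combination of the classical estimates, it always lies between the smallest and largest of them.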
Also suggested was a second iteration of $\hat\theta_D$.
This iteration gives

$$ \hat\theta'_D = \frac{\sum_{i=1}^{n} \hat\theta_{D,i}\, f(\hat\theta_n \mid \hat\theta_{D,i})}{\sum_{i=1}^{n} f(\hat\theta_n \mid \hat\theta_{D,i})} \tag{7.4} $$

where $\hat\theta_{D,i}$ is given by (7.3). Further iterations are,
of course, possible; however, as with the continuously
smooth estimators, they may provide a significant increase
in squared error and thus may be undesirable.


7.2 The Poisson Distribution

Let us assume that in each experiment, a single
observation $x_i$ $(i = 1, 2, \ldots, n)$ is obtained from the
Poisson mass function given by (3.1). From (3.4) the
maximum-likelihood estimate of $\theta$ is simply $x$ or $\hat\theta$.
The subscript $k$ has, for convenience, been deleted.


The Poisson distribution is readily verified to be
a member of family $F_1$ in which $a(\hat\theta_n) = 0$ and
$b(\hat\theta_n) = 1/(\hat\theta_n + 1)$. The empirical Bayes estimate of $\theta_n$
is therefore given by

$$ \theta^*_P = (\hat\theta_n + 1)\, \frac{P_n(\hat\theta_n + 1)}{P_n(\hat\theta_n)} . \tag{7.5} $$
Clearly, $P_n(\hat\theta_n)$ can be estimated from the sequence of
observations $\hat\theta_1, \hat\theta_2, \ldots, \hat\theta_n$. Although $\theta^*_P$ is the
unique estimator obtained from family $F_1$, it is
precisely the estimator Robbins [26] used to introduce
the empirical Bayes situation.
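As an illustration, the estimator can be computed from relative frequencies of the observed counts. This is a sketch assuming that $P_n$ is estimated by the empirical frequency of each count in the sequence, the current observation included; the function name `robbins_estimate` is hypothetical.

```python
from collections import Counter

def robbins_estimate(current, observations):
    """(x + 1) * P_n(x + 1) / P_n(x) with P_n the empirical frequencies.

    The common factor 1/n cancels, so the raw counts are used directly.
    `current` must appear in `observations`, which keeps the divisor positive.
    """
    counts = Counter(observations)
    return (current + 1) * counts[current + 1] / counts[current]

# Made-up Poisson counts; the current observation is part of the sequence.
observed = [0, 1, 1, 2, 2, 2, 3]
estimate_for_1 = robbins_estimate(1, observed)   # 2 * 3 / 2
estimate_for_3 = robbins_estimate(3, observed)   # no count of 4 was seen
```

The second call returns zero because no larger count has been observed, which hints at why this estimator can be erratic when the number of experiences is small.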

For the Poisson distribution, the estimators proposed 
by Lemon and Krutchkoff [17] can be represented by 


and 



(7.6) 



Monte Carlo simulations for the estimators $\theta^*_P$,
$\hat\theta_P$, and $\hat\theta'_P$ were conducted in a manner analogous to
the procedure described for $\tilde\theta_D$ in section 3.5. As in
section 3.5 the criterion for comparison was mean-squared
error, and therefore the ratio $R$ defined by
(3.25) was of interest. Here we denote this ratio by
$R^*_P$, $R_P$, and $R'_P$ when calculated with $\theta^*_P$, $\hat\theta_P$,
and $\hat\theta'_P$ respectively. The ratio $R$ calculated with
the smooth estimator $\tilde\theta_{P,V}$ given by (3.22) will be
denoted by $R_{P,V}$.


In Figures 50-52 the values of $R'_P$ and $R_{P,V}$,
represented by the broken and solid lines respectively,
are plotted as a function of the number of experiences.
These plots are given for various values of the summary
quantity $Z$ given by (3.27). The estimator $\hat\theta'_P$ was
observed to provide uniform squared-error improvement
over the estimator $\hat\theta_P$; therefore the ratio $R_P$ is not
shown. Also, the ratio $R^*_P$ is omitted since for
$0.5 \le Z \le 5.0$ and $n \le 20$, the maximum-likelihood
estimator gave smaller mean-squared errors than the
estimator $\theta^*_P$ regardless of the form of the prior
distribution. By comparing the values of $R_{P,V}$ with
$R'_P$ in each of the figures, it can be seen that the
continuously smooth estimator provides consistent mean-squared
improvement over the estimator $\hat\theta'_P$. Hence,
$\tilde\theta_{P,V}$ provides consistent improvement over $\hat\theta_P$ and,
of course, over $\theta^*_P$.


7.3 The Weibull Distribution with Known Shape Parameter


Let us assume that in each experiment, a random
sample of $k$ observations $x = (x_1, x_2, \ldots, x_k)$ is
obtained from the Weibull density function given by
(4.1). In section 4.1, the sufficient statistic
$T = \sum_{j=1}^{k} x_j^\beta$ formed with this sample was shown to have
the conditional gamma density function given by (4.15).
Differentiating this function $f(t \mid \alpha)$ with respect to
$t$ and dividing by $f(t \mid \alpha)$ we obtain

$$ \frac{\partial f(t \mid \alpha) / \partial t}{f(t \mid \alpha)} = -\alpha + \frac{k - 1}{t} . \tag{7.8} $$


The gamma distribution is therefore a member of family
$F_2$, and a consistent estimator for $\alpha_n$ can be given
by

$$ \alpha^*_G = \frac{k - 1}{t_n} - \frac{f'_n(t_n)}{f_n(t_n)} . \tag{7.9} $$


The consistent estimators $f_n(t_n)$ and $f'_n(t_n)$ chosen to
represent $f(t)$ and its first derivative are given by

                                                  (7.10)

and

$$ f'_n(t_n) = \frac{f_n(t_n + h) - f_n(t_n)}{h} \tag{7.11} $$

respectively.
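The family $F_2$ estimator for the gamma case can be sketched with a forward difference standing in for the derivative. The sketch assumes that the conditional density of $T$ is gamma with shape $k$ and rate $\alpha$, so that $a(t) = (k-1)/t$ and $b(t) = -1$; the marginal density estimate is passed in as a function, and the name `f2_gamma_estimate` and the step size are hypothetical.

```python
import math

def gamma_density(t, k, alpha):
    """Gamma density with shape k and rate alpha (assumed form of f(t | alpha))."""
    return alpha ** k * t ** (k - 1) * math.exp(-alpha * t) / math.gamma(k)

def f2_gamma_estimate(t_n, k, density, h=1e-5):
    """(k - 1)/t_n - f'(t_n)/f(t_n), with f' taken as a forward difference."""
    f = density(t_n)
    f_prime = (density(t_n + h) - f) / h
    return (k - 1) / t_n - f_prime / f

# Sanity check with a degenerate prior: if every experiment had alpha = 2,
# the marginal density of T is itself gamma and the estimate recovers 2.
check = f2_gamma_estimate(3.0, 5, lambda t: gamma_density(t, 5, 2.0))
```

In practice the density would be estimated from the observed statistics, and the ratio of an estimated derivative to an estimated density can be unstable where the density estimate is small.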
The estimators of Lemon and Krutchkoff [17], $\hat\alpha_G$
and $\hat\alpha'_G$, can be constructed using as the kernel of
integration the inverted gamma density function given
by (4.16). These estimators become


and 



( 7 . 12 ) 



where $\hat\alpha_i$ $(i = 1, 2, \ldots, n)$ is the maximum-likelihood
estimate from the $i$th experiment.
Monte Carlo simulations for the empirical Bayes
estimators $\alpha^*_G$, $\hat\alpha_G$, and $\hat\alpha'_G$ were conducted in a
manner analogous to the procedure described for $\tilde\theta_D$
in section 3.5. The ratios $R^*_G$, $R_G$, and $R'_G$
defined by (3.25) were calculated with $\alpha^*_G$, $\hat\alpha_G$,
and $\hat\alpha'_G$ respectively.

G 


In Figures 53-55, the values of $R^*_G$, $R'_G$, and
$R_{G,V}$, denoted by the dotted, broken, and solid lines
respectively, are plotted for various values of $Z$
defined by (4.44). The ratio $R_{G,V}$ was calculated
with the continuously smooth estimator $\tilde\alpha_{G,V}$ given
by (4.42). The ratio $R_G$ has been omitted in each
figure since the results obtained with the estimator
$\hat\alpha'_G$ were in all cases considered uniformly superior
when compared to the results obtained with $\hat\alpha_G$. In
each figure comparison of the plots of $R^*_G$, $R'_G$,
and $R_{G,V}$ as a function of the number of experiments
demonstrates the significant improvements one can
obtain using the continuously smooth estimator $\tilde\alpha_{G,V}$
as opposed to $\alpha^*_G$ or $\hat\alpha'_G$. Since $\tilde\alpha_{G,V}$ provides
consistent mean-squared improvement over $\hat\alpha'_G$, it
also provides consistent improvement over $\hat\alpha_G$.
7.4 The Weibull Distribution with Known Scale Parameter

Let us assume that in each experiment a random
sample of $k$ observations is obtained from the Weibull
density function given by (5.1). The maximum-likelihood
estimator for the shape parameter $\beta$ is found by the
solution of (5.3). This estimator,
given any value of the parameter $\beta$, has an asymptotic
normal distribution with mean $\beta$ and variance given
by (5.12). This distribution can be used to form the
empirical Bayes estimators


and 



(7.14) 



(7.15) 
The estimators given by (7.14) and (7.15) are of
the form proposed by Lemon and Krutchkoff [17]; however,
they are based on an asymptotic distribution rather
than on the true distribution. This result represents
a natural extension of their estimators and will be
treated accordingly. We remark here that the Weibull
distribution with known scale parameter cannot be
placed into any of the families of Rutherford and
Krutchkoff [28].

Monte Carlo simulations for the estimators $\hat\beta_N$
and $\hat\beta'_N$ were conducted in a manner analogous to the
procedure described for $\tilde\theta_D$ in section 3.5. The ratios
$R_N$ and $R'_N$ defined by (3.25) were calculated with
$\hat\beta_N$ and $\hat\beta'_N$ respectively.


In Figures 56-58 the values of $R'_N$ and $R_{N,V}$,
denoted by the broken and solid lines respectively, are
plotted for various values of $Z$ as defined by (5.25).
The ratio $R_{N,V}$ was calculated with the continuously
smooth estimator $\tilde\beta_{N,V}$ given by (5.23). As in the
above cases, it was found that the estimator $\hat\beta'_N$
provided significant squared-error improvement over
the corresponding estimator $\hat\beta_N$, and therefore the
ratio $R_N$ has been omitted in each figure. Comparison
of the plots of $R'_N$ and $R_{N,V}$ in each of the
Figures 56-58 demonstrates the significant improvement
one can obtain using the continuously smooth estimator
$\tilde\beta_{N,V}$ as opposed to $\hat\beta'_N$. Since $\hat\beta'_N$ was observed to
give uniform improvement when compared with $\hat\beta_N$, the
estimator $\tilde\beta_{N,V}$ is also more efficient than $\hat\beta_N$.


















CHAPTER VIII 


CONCLUSIONS AND RECOMMENDATIONS 

The purpose of this chapter is to summarize the 
results of this dissertation and to suggest directions 
for future research. 

8.1 General Conclusions

A new method of empirical Bayes estimation has 
been presented. The versatility of the method has been 
demonstrated for estimating the parameters of several 
distributions. The estimator has also been shown to 
be more efficient in a mean-squared sense than other 
well-known and widely used empirical Bayes estimators. 

The continuously smooth empirical Bayes estimator
was developed in Chapter II. The estimator was presented
in a general form applicable to multivariate
estimation problems. The estimator allows the researcher
the use of present as well as previous information, as
opposed to classical estimation techniques which
restrict the researcher to the use of only present or
current information. The past information is in the
form of classical estimates of other parameters in
similar but independent experiments. By "similar" we
mean that there exists a common prior distribution of
the parameter vector, but it remains forever unknown.

The smooth estimator was obtained by representing 
the prior density function in the Bayes estimator by a 
continuous approximation formed from a sequence of 
classical estimates. This approximation was based on 
three general assumptions. They are as follows: 

(i) The prior distribution has a continuous density
function.

(ii) The classical estimator is both sufficient and 
consistent for estimating the parameter vector. 

(iii) The distribution of the classical estimator 
is known. 

In practical situations a noncontinuous or discrete 
prior density would sometimes seem to be an unlikely phe- 
nomenon. If, however, the prior density was noncontinuous, 
perhaps it could be approximated by a continuous density 
function. The effect such an approximation would have on 
the smooth estimator is a subject for future research. 

In general most classical estimators can be shown 
to be consistent. In particular, the widely used 
maximum-likelihood estimators are consistent under 
general regularity conditions. The assumption of 
consistency was used to prove that the marginal density
function of the classical estimator converges in proba- 
bility to the prior density function as the sample size 
tends to infinity. Experimental results revealed that 
the smooth estimator gave improved results for small 
sample sizes. Thus it is conjectured that the assump- 
tion of consistency represents a property which in 
practice can be relaxed. If a classical estimator 
provides "good" results, then a smooth estimator based 
on it should also give "good" results, regardless of 
the consistency property. The sufficiency property can 
also be relaxed. This was demonstrated in Chapters V 
and VI. Therefore it is conjectured that the second 
assumption represents an unnecessary restriction in many 
practical applications. 

The third assumption may also be viewed as a general 
restriction. When the true distribution of the classical 
estimator is unobtainable, its asymptotic distribution 
may be known. This distribution can then be used to 
form a continuously smooth empirical Bayes estimator. 

In Chapters V and VI this technique was employed, and
significant squared-error improvement over the classical
maximum-likelihood method was achieved.

In all cases considered in the Monte Carlo studies
of the preceding chapters, the continuously smooth
empirical Bayes estimators uniformly provided mean-squared
improvement over the maximum-likelihood
estimators. The smooth estimators were also observed
to be robust to the form of the prior distribution.

For all types of Pearson prior distributions with varying 
coefficients of skewness and kurtosis, the ratio R of 
empirical Bayes mean-squared error to maximum-likelihood 
mean-squared error was observed to be significantly 
influenced by the prior distribution only through a 
quantity Z . Apart from the number of experiences, 
the only quantities affecting the ratio R are contained 
in Z . Therefore this quantity was conveniently used 
to summarize and index the amount of improvement achieved 
by the smooth estimators over the maximum-likelihood 
estimators. In particular as the value of Z increased, 
the ratio R decreased. This phenomenon is easily 
explained. As the variance of the maximum-likelihood 
estimator, given the corresponding parameter value, 
increases relative to variance of the prior distribution, 
the maximum-likelihood estimates will vary widely. The 
smooth estimators, however, are capable of "detecting" 
this variation and can use this information to obtain 
improved estimates. Conversely, if the conditional 
variance is small as compared to the prior variance, 
then the maximum-likelihood estimator would be expected 
to do quite well. In this case there is a great deal of
information within an experiment, and previous experiments
contribute very little information about the
parameter.

In the preceding chapters whenever point estimates
were required for distributions conditional on only one
parameter, four smooth estimators were considered. They
are: (1) an estimator $\tilde\theta_D$ whose prior density approximation
is based on a sequence of classical estimates,
(2) an estimator $\tilde\theta'_D$ whose prior density approximation
is based on a sequence of smooth estimates obtained from
$\tilde\theta_D$, (3) an estimator $\tilde\theta_{D,V}$ whose prior density approximation
is based on a sequence of transformed maximum-likelihood
estimates having a marginal distribution
whose mean and variance are approximately equivalent
to those of the prior distribution, and (4) an estimator
$\tilde\theta'_{D,V}$ whose prior density approximation is based on a
sequence of smooth estimates obtained from $\tilde\theta_{D,V}$. Of
these estimators, the smooth estimator $\tilde\theta_{D,V}$ was
observed to be the most efficient in a mean-squared
sense. This result is expected. The improvement
achieved by the smooth estimators was observed to be a
function of the mean and variance of the prior distribution
as demonstrated by the summary quantity $Z$;
thus any consideration given to accurately estimating
these moments by the prior density approximation should
result in squared-error improvement.

The transformation given in section 2.5, on which
the smooth estimators $\tilde\theta_{D,V}$ are based, was only applied
to the maximum-likelihood estimators. It is, however, 
not just restricted to this method of estimation. The 
transformation can be applied to any classical estimator 
whose distribution is known. 

In Chapter VI smooth estimators were obtained for
the parameter vector $\gamma = (\alpha, \beta)$ in the two-parameter
Weibull distribution. Two distinct types of smooth
estimators were considered. The first type, denoted by
$\tilde\alpha_N$ and $\tilde\beta_N$ respectively, was constructed in a manner
analogous to that given in Chapter II. These estimators
were based on an approximation to the bivariate prior
density function. No dependence assumptions on $\alpha$ or
$\beta$ were required. The estimators were, however, based
on the assumption that the maximum-likelihood estimators
are jointly sufficient for $\alpha$ and $\beta$. This is not
known to be true. The second type of smooth estimators,
denoted by $\tilde\alpha_M$ and $\tilde\beta_M$ respectively and referred to
as marginal empirical Bayes estimators, was based on the
assumptions that: (1) the maximum-likelihood estimators
are independently distributed and are marginally sufficient
for $\alpha$ and $\beta$ respectively, and (2) the
parameters $\alpha$ and $\beta$ are independently distributed.
Under these assumptions, the Bayes estimators for $\alpha$
and $\beta$ become $E(\alpha \mid \hat\alpha)$ and $E(\beta \mid \hat\beta)$ respectively. In the
case of the Weibull distribution, however, neither
assumption can generally be made.

In section 6.3 we observed that even when $\alpha$ and
$\beta$ are highly correlated, the estimators $\alpha_M$ and $\beta_M$
gave "better" results than could be obtained using $\alpha_N$
and $\beta_N$. It is conjectured that the compounding effect
of being unable to accurately approximate bivariate
densities and of having to perform double integration
over regions approximated from sample data causes this
phenomenon. The error introduced by such approximations
appears to be more significant than that caused by the
false assumptions of independence and sufficiency.

In the Monte Carlo studies of the preceding chapters,
the squared-error improvement achieved by each of the
smooth estimators over that of the classical maximum-likelihood
estimators was observed to reach its maximum
gradient during the initial 10 experiments. For example,
in Figure 12 for a Z value of 2.0, the smooth estimator
$\alpha_{G,V}$ achieves 34 percent improvement over the maximum-likelihood
estimator after the second experience and
gains 31 percent more improvement through the tenth
experiment. From the tenth to the twentieth experiment,
only a further 7 percent gain is observed. This result tends
to indicate that in practice an accumulation of more
than 10 sets of data may be unnecessary. The amount of
labor required to obtain additional data may not be
worth the slight increase in improvement.

In Chapter VII the continuously smooth empirical 
Bayes estimator was shown to be significantly superior, 
over 20 experiences, to the estimators proposed by 
Rutherford and Krutchkoff [28] and the step function 
estimators evolved from the method of Lemon and 
Krutchkoff [17]. The estimators of Rutherford and 
Krutchkoff are $\varepsilon$-asymptotically optimal; however, their
mean-squared errors appear to be much larger than those 
of the continuously smooth estimators or the estimators 
of Lemon and Krutchkoff. It is the author’s opinion 
that although bypassing explicit estimation of the 
prior distribution may be theoretically desirable as 
well as convenient, in practical application such 
procedures are undesirable. As demonstrated in 
Chapter VII they not only gave poorer results than the 
explicit estimation procedures, but were limited in
application. In particular, the Weibull distribution
with unknown shape parameter, considered in Chapter V, 
could not be placed into any of the families proposed 
by Rutherford and Krutchkoff. Furthermore, attempts by 
the author to obtain an empirical Bayes estimator which 
does not require explicit estimation of the prior 
distribution were unsuccessful. 

The empirical Bayes estimators $\hat{\theta}_D$ and $\theta^*_D$
suggested by Lemon and Krutchkoff and given by (7.3)
and (7.4) respectively were observed to be more efficient
than those of Rutherford and Krutchkoff; however, they
were not as efficient as the continuously smooth estimators.
In particular, the estimator $\theta_{G,V}$ was
observed to be significantly superior, over a run of
20 experiences, to the estimators $\hat{\theta}_D$ and $\theta^*_D$. It is
conjectured that the observed improvements obtained by
the smooth estimators result from the continuous prior
density approximation being less sensitive to the
estimates than is the discrete approximation to the
prior distribution.

8.2 Areas for Future Research

Application of the continuously smooth empirical
Bayes estimator to multivariate density estimation will
be cumbersome if an electronic computer is unavailable.
Even with the aid of a computer the numerical solution
of multiple integrals could prove frustrating to the
researcher, who may abandon the method for one of
lesser precision. Thus a method to avoid this
numerical integration would significantly enhance the
practical value of the smooth estimation technique.
Such a method would not be generally applicable since it
would depend on the kernel of integration. It may be
possible, however, to obtain closed-form solutions for
the integrals for certain families of distributions.
Such solutions may be accomplished by considering various
forms for the prior density estimator. The sine function
has been chosen in this dissertation, although many
other forms are given by Parzen [24] for univariate
estimation and by Martz [19] for multivariate estimation.
Currently such investigations are being conducted at
Texas Tech University, and initial results are
encouraging.

Another fruitful area for research lies in
experimental design. Since empirical Bayes procedures
allow the researcher the use of current as well as past
information, experiments should be designed with these
procedures foremost in mind. Presently, however, such
designs are virtually unavailable. The number of
experiments and the sample size that should be used
in order to obtain a certain level of performance are
subjects for further research. If such designs could
be obtained, then the total number of items subjected to
testing procedures could be drastically reduced. For
example, if the total number of rocket engines subjected
to firing tests in an engine development program could
be reduced, then a substantial reduction in cost would
be realized.

A general field of research would be the application
of the smooth estimator to various practical problems.
A particularly suitable field of application that is
indicated by the research described in this dissertation
lies in Reliability and Maintainability Engineering.
This suitability is twofold: first, the empirical Bayes
approach frequently lends itself to the testing situations
encountered in this field; second, the two-parameter
Weibull distribution considered in the preceding chapters
is a highly versatile and widely used time-to-failure
distribution. Presently research of this nature is
being conducted at Texas Tech University.

8.3 Conclusion

The purpose of the research described in this 
dissertation was to develop and exploit a new method 
of empirical Bayes estimation. This purpose has been
accomplished. The continuously smooth estimator has
been shown to provide significant improvements over both
the classical maximum-likelihood method and other well- 
known and widely used empirical Bayes methods. In 
addition, the results of this research constitute a 
foundation upon which future applications can be based.

APPENDIX A: GAUSS QUADRATURE


An accurate formula for finding the value of the
definite integral

$$ I = \int_a^b f(x)\,dx \tag{A.1} $$

where f(x) is a known function whose integral is
either not easily evaluated or cannot be conveniently
expressed in closed form, was derived by Gauss and is
based on Legendre polynomials. The procedure is to
obtain the subdivision of the interval (a,b), the value
of the function at these points, and the coefficients
to multiply the functional values to yield the value
of the definite integral.

First we transform the interval x = (a,b) into
the interval t = (-1,1) by letting

$$ x = \tfrac{1}{2}(b - a)\,t + \tfrac{1}{2}(a + b) . $$

The new form of f(x) is

$$ f(x) = f\!\left[ \tfrac{1}{2}(b - a)\,t + \tfrac{1}{2}(a + b) \right] = \phi(t) $$

and

$$ dx = \tfrac{1}{2}(b - a)\,dt $$

so that

$$ \int_a^b f(x)\,dx = \frac{b - a}{2} \int_{-1}^{1} \phi(t)\,dt . $$

Using the Gauss mechanical quadrature formula

$$ \int_{-1}^{1} \phi(t)\,dt \approx \sum_{k=1}^{n} A_k^{(n)}\, \phi\!\left(t_k^{(n)}\right) \qquad (n = 2, 3, \ldots) $$

where n is the number of points of subdivision of the
interval (-1,1), $A_k^{(n)}$ the weighting coefficients, and
$t_k^{(n)}$ the zeros of the Legendre polynomial of degree
n, we have

$$ I_n = \frac{b - a}{2} \sum_{k=1}^{n} A_k^{(n)}\, f\!\left( \frac{b - a}{2}\, t_k^{(n)} + \frac{b + a}{2} \right) \qquad (n = 2, 3, \ldots) $$


The $A_k^{(n)}$ and $t_k^{(n)}$ are symmetric with respect
to t = 0, that is,

$$ A_k^{(n)} = A_{n-k+1}^{(n)} , \qquad t_k^{(n)} = -\,t_{n-k+1}^{(n)} . $$

A table of the zeros $t_k^{(n)}$ of the Legendre polynomials
of order 1-16 and the weight coefficients $A_k^{(n)}$ for
the Gauss mechanical quadrature is given by Lowan
et al. [18]. In Table A1 we have reproduced these values
for the special cases n = 3 and n = 11.
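The transformed quadrature formula for $I_n$ can be illustrated with a short Python sketch. It is not the GAUSS subroutine itself: the function name `gauss11` and the list names are hypothetical, and the constants are the n = 11 zeros and weights quoted to ten decimals, with the negative zeros supplied by the symmetry relation.

```python
import math

# Non-negative zeros t_k and weights A_k of the 11-point rule (ten decimals).
# The last entry (t = 0) is the midpoint; the other five zeros also occur
# with the opposite sign and the same weight.
T11 = [0.9782286581, 0.8870625998, 0.7301520056,
       0.5190961292, 0.2695431560, 0.0]
A11 = [0.0556685671, 0.1255803695, 0.1862902109,
       0.2331937646, 0.2628045445, 0.2729250868]


def gauss11(f, a, b):
    """Approximate the integral of f over (a, b) by the 11-point Gauss rule."""
    half = 0.5 * (b - a)   # scale factor (b - a)/2 from dx = (b - a)/2 dt
    mid = 0.5 * (a + b)    # image of t = 0 under the change of variable
    total = A11[5] * f(mid)
    for t, w in zip(T11[:5], A11[:5]):
        # Symmetric pair of nodes x = mid +/- half * t share the weight w.
        total += w * (f(mid + half * t) + f(mid - half * t))
    return half * total


if __name__ == "__main__":
    # Integral of exp(x) over (0, 1); the exact value is e - 1.
    print(gauss11(math.exp, 0.0, 1.0), math.e - 1.0)
```

Because the tabulated constants carry ten decimals, the result of this sketch cannot be expected to be more accurate than roughly that many digits.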

TABLE A1

GAUSS'S QUADRATURE COEFFICIENTS

  k        t_k               A_k

n = 3
  1    0.77459 66692    0.55555 55556
  2    0.00000 00000    0.88888 88889

n = 11
  1    0.97822 86581    0.05566 85671
  2    0.88706 25998    0.12558 03695
  3    0.73015 20056    0.18629 02109
  4    0.51909 61292    0.23319 37646
  5    0.26954 31560    0.26280 45445
  6    0.00000 00000    0.27292 50868

In particular, in the subroutine GAUSS, which was
used to solve the integrals of the smooth estimator,
n was taken to be 11. Also a "built in" accuracy
check is made. The value of the integral given by (A.1)
is first calculated; then the integral

$$ I' = \int_a^{(a+b)/2} f(x)\,dx + \int_{(a+b)/2}^{b} f(x)\,dx $$

is computed. If $\delta < \varepsilon$, where

$$ \delta = \begin{cases} \dfrac{|I - I'|}{|I'|} & \text{if } |I'| > \varepsilon \\[1ex] |I - I'| & \text{if } |I'| < \varepsilon \end{cases} $$

and $\varepsilon$ is the desired tolerance given by input, then
the value $I'$ is returned as the value of the integral.
If $\delta > \varepsilon$ then the range (a,b) is partitioned into
four subintervals and the sum $I''$ of the integrals over
each interval is computed and $\delta$ is formed with $I'$
and $I''$. If $\delta < \varepsilon$ then $I''$ is returned as the
value of the integral. If $\delta > \varepsilon$ then the region (a,b) is
again subdivided. This procedure is repeated IT times,
a value given by input. If convergence is not reached
after IT subdivisions, an error message is printed and
the program terminated.
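The subdivision check can be sketched in Python as follows. This is an illustration under stated assumptions, not the original GAUSS code: the names `gauss11`, `gauss_adaptive`, `eps`, and `max_it` are hypothetical, each pass is assumed to double the number of equal subintervals, and `max_it` plays the role of the input value IT.

```python
import math

# Non-negative zeros and weights of the 11-point rule (ten decimals);
# negative zeros follow from symmetry about t = 0.
T11 = [0.9782286581, 0.8870625998, 0.7301520056,
       0.5190961292, 0.2695431560, 0.0]
A11 = [0.0556685671, 0.1255803695, 0.1862902109,
       0.2331937646, 0.2628045445, 0.2729250868]


def gauss11(f, a, b):
    """11-point Gauss rule on (a, b)."""
    half, mid = 0.5 * (b - a), 0.5 * (a + b)
    total = A11[5] * f(mid)
    for t, w in zip(T11[:5], A11[:5]):
        total += w * (f(mid + half * t) + f(mid - half * t))
    return half * total


def gauss_adaptive(f, a, b, eps=1e-8, max_it=10):
    """Halve the subintervals until two successive estimates agree within eps."""
    previous = gauss11(f, a, b)          # estimate over the whole range
    pieces = 1
    for _ in range(max_it):
        pieces *= 2                      # assumed: 2, 4, 8, ... equal pieces
        h = (b - a) / pieces
        current = sum(gauss11(f, a + j * h, a + (j + 1) * h)
                      for j in range(pieces))
        diff = abs(previous - current)
        # Relative difference unless the estimate itself is below eps.
        delta = diff / abs(current) if abs(current) > eps else diff
        if delta < eps:
            return current
        previous = current
    raise RuntimeError("no convergence after max_it subdivisions")


if __name__ == "__main__":
    print(gauss_adaptive(math.sin, 0.0, math.pi))  # exact value is 2
```

Raising an exception stands in for the printed error message and program termination described in the text.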



APPENDIX B: NEWTON'S METHOD 


The well-known iterative procedure of Newton can
be easily applied to the solution of the nonlinear
maximum-likelihood estimating equations of Chapters V
and VI. This procedure can be obtained by truncating
the Taylor series expansion after two terms and has
the form

$$ x_{i+1} = x_i - \frac{f(x_i)}{f'(x_i)} \qquad (i = 0, 1, \ldots) . \tag{B.1} $$

Convergence to the root is quadratic if the multiplicity
of the root to be determined is equal to one and
if f(x) is a twice-differentiable function.

The subroutine RTNI based on (B.1) and given in the
System/360 Scientific Subroutine Package was used to
solve for the maximum-likelihood estimators. In this
subroutine the iterative procedure is terminated if the
following two conditions are satisfied:

$$ \delta < \varepsilon \quad \text{and} \quad |f(x_{i+1})| < 100\,\varepsilon $$

and tolerance $\varepsilon$ given by input. If the procedure
does not converge within a specified number of iteration
steps, an error message is given. For further details
on the method as well as reasons for divergence, see
Hildebrand [13].
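The iteration (B.1) and its two-part stopping rule can be sketched in Python as follows. This is not the RTNI subroutine: the names `newton` and `weibull_shape_score` are hypothetical, and $\delta$ is assumed here to be the change between successive iterates, taken relative to the new iterate when its magnitude exceeds one. The usage example applies the sketch to the standard likelihood equation for the Weibull shape parameter from a complete sample, which is not necessarily the form of the estimating equations used in Chapters V and VI; the data values are made up for illustration.

```python
import math


def newton(f, fprime, x0, eps=1e-10, max_iter=50):
    """Newton iteration x_{i+1} = x_i - f(x_i)/f'(x_i) with a two-part stop."""
    x = x0
    for _ in range(max_iter):
        slope = fprime(x)
        if slope == 0.0:
            raise RuntimeError("zero derivative; Newton step undefined")
        x_new = x - f(x) / slope
        step = abs(x_new - x)
        # Assumed definition of delta: relative change for large iterates,
        # absolute change otherwise.
        delta = step / abs(x_new) if abs(x_new) > 1.0 else step
        if delta < eps and abs(f(x_new)) < 100.0 * eps:
            return x_new
        x = x_new
    raise RuntimeError("no convergence within max_iter iteration steps")


def weibull_shape_score(beta, data):
    """Return g(beta) and g'(beta) for the standard Weibull shape equation
    g(beta) = sum(x^beta ln x)/sum(x^beta) - 1/beta - mean(ln x)."""
    logs = [math.log(x) for x in data]
    powers = [x ** beta for x in data]
    s0 = sum(powers)
    s1 = sum(p * l for p, l in zip(powers, logs))
    s2 = sum(p * l * l for p, l in zip(powers, logs))
    g = s1 / s0 - 1.0 / beta - sum(logs) / len(data)
    dg = (s2 * s0 - s1 * s1) / (s0 * s0) + 1.0 / (beta * beta)
    return g, dg


if __name__ == "__main__":
    sample = [1.2, 0.8, 2.5, 1.9, 0.6, 1.4, 3.1, 2.2]  # illustrative data
    beta_hat = newton(lambda b: weibull_shape_score(b, sample)[0],
                      lambda b: weibull_shape_score(b, sample)[1],
                      x0=1.0)
    print(beta_hat)
```

Supplying the derivative analytically mirrors (B.1) directly; a starting value of 1.0 is only a convenient guess and other data may call for a different one.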